pod/perlsub.pod

   1 =head1 NAME
   2
   3 perlsub - Perl subroutines
   4
   5 =head1 SYNOPSIS
   6
   7 To declare subroutines:
   8
   9     sub NAME;             # A "forward" declaration.
  10     sub NAME(PROTO);      #  ditto, but with prototypes
  11
  12     sub NAME BLOCK        # A declaration and a definition.
  13     sub NAME(PROTO) BLOCK #  ditto, but with prototypes
  14
  15 To define an anonymous subroutine at runtime:
  16
  17     $subref = sub BLOCK;
  18
  19 To import subroutines:
  20
  21     use PACKAGE qw(NAME1 NAME2 NAME3);
  22
  23 To call subroutines:
  24
  25     NAME(LIST);    # & is optional with parentheses.
  26     NAME LIST;     # Parentheses optional if pre-declared/imported.
  27     &NAME;         # Passes current @_ to subroutine.
  28
  29 =head1 DESCRIPTION
  30
  31 Like many languages, Perl provides for user-defined subroutines.  These
  32 may be located anywhere in the main program, loaded in from other files
  33 via the C<do>, C<require>, or C<use> keywords, or even generated on the
  34 fly using C<eval> or anonymous subroutines (closures).  You can even call
  35 a function indirectly using a variable containing its name or a CODE reference
  36 to it, as in C<$var = \&function>.
  37
  38 The Perl model for function call and return values is simple: all
  39 functions are passed as parameters one single flat list of scalars, and
  40 all functions likewise return to their caller one single flat list of
  41 scalars.  Any arrays or hashes in these call and return lists will
  42 collapse, losing their identities--but you may always use
  43 pass-by-reference instead to avoid this.  Both call and return lists may
  44 contain as many or as few scalar elements as you'd like.  (Often a
  45 function without an explicit return statement is called a subroutine, but
  46 there's really no difference from the language's perspective.)
  47
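As a small illustration of the flattening described above, the following sketch passes an array, a hash, and a scalar to one subroutine and counts what arrives.  The name C<count_args> and the sample data are made up for this example.

```perl
use strict;
use warnings;

# count_args() is a hypothetical helper: it reports how many
# scalars arrived in @_ after the caller's list was flattened.
sub count_args {
    return scalar @_;
}

my @pair  = (1, 2);
my %table = (a => 1);

# @pair contributes 2 scalars, %table contributes a key and a value,
# and the literal 'x' contributes 1.
my $n = count_args(@pair, %table, 'x');
print "received $n scalars\n";    # prints "received 5 scalars"
```

The subroutine cannot tell where one container ended and the next began; it sees only five scalars.
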
  48 Any arguments passed to the routine come in as the array @_.  Thus if you
  49 called a function with two arguments, those would be stored in C<$_[0]>
  50 and C<$_[1]>.  The array @_ is a local array, but its values are implicit
  51 references (predating L<perlref>) to the actual scalar parameters.  The
  52 return value of the subroutine is the value of the last expression
  53 evaluated.  Alternatively, a return statement may be used to specify the
  54 returned value and exit the subroutine.  If you return one or more arrays
  55 and/or hashes, these will be flattened together into one large
  56 indistinguishable list.
  57
  58 Perl does not have named formal parameters, but in practice all you do is
  59 assign to a my() list of these.  Any variables you use in the function
  60 that aren't declared private are global variables.  For the gory details
  61 on creating private variables, see
  62 L<"Private Variables via my()"> and L<"Temporary Values via local()">.
  63 To create protected environments for a set of functions in a separate
  64 package (and probably a separate file), see L<perlmod/"Packages">.
  65
  66 Example:
  67
  68     sub max {
  69         my $max = shift(@_);
  70         foreach my $foo (@_) {
  71             $max = $foo if $max < $foo;
  72         }
  73         return $max;
  74     }
  75     $bestday = max($mon,$tue,$wed,$thu,$fri);
  76
  77 Example:
  78
  79     # get a line, combining continuation lines
  80     #  that start with whitespace
  81
  82     sub get_line {
  83         $thisline = $lookahead;  # GLOBAL VARIABLES!!
  84         LINE: while ($lookahead = <STDIN>) {
  85             if ($lookahead =~ /^[ \t]/) {
  86                 $thisline .= $lookahead;
  87             }
  88             else {
  89                 last LINE;
  90             }
  91         }
  92         $thisline;
  93     }
  94
  95     $lookahead = <STDIN>;       # get first line
  96     while ($_ = get_line()) {
  97         ...
  98     }
  99
 100 Use array assignment to a local list to name your formal arguments:
 101
 102     sub maybeset {
 103         my($key, $value) = @_;
 104         $Foo{$key} = $value unless $Foo{$key};
 105     }
 106
 107 This also has the effect of turning call-by-reference into call-by-value,
 108 because the assignment copies the values.  Otherwise a function is free to
 109 do in-place modifications of @_ and change its caller's values.
 110
 111     upcase_in($v1, $v2);  # this changes $v1 and $v2
 112     sub upcase_in {
 113         for (@_) { tr/a-z/A-Z/ }
 114     }
 115
 116 You aren't allowed to modify constants in this way, of course.  If an
 117 argument were actually literal and you tried to change it, you'd take a
 118 (presumably fatal) exception.   For example, this won't work:
 119
 120     upcase_in("frederick");
 121
 122 It would be much safer if the upcase_in() function
 123 were written to return a copy of its parameters instead
 124 of changing them in place:
 125
 126     ($v3, $v4) = upcase($v1, $v2);  # this doesn't
 127     sub upcase {
 128         my @parms = @_;
 129         for (@parms) { tr/a-z/A-Z/ }
 130         # wantarray checks if we were called in list context
 131         return wantarray ? @parms : $parms[0];
 132     }
 133
 134 Notice how this (unprototyped) function doesn't care whether it was passed
 135 real scalars or arrays.  Perl will see everything as one big long flat @_
 136 parameter list.  This is one of the ways where Perl's simple
 137 argument-passing style shines.  The upcase() function would work perfectly
 138 well without changing the upcase() definition even if we fed it things
 139 like this:
 140
 141     @newlist   = upcase(@list1, @list2);
 142     @newlist   = upcase( split /:/, $var );
 143
 144 Do not, however, be tempted to do this:
 145
 146     (@a, @b)   = upcase(@list1, @list2);
 147
 148 This fails because, like the flat incoming parameter list, the return list is also
 149 flat.  So all you have managed to do here is store everything in @a and
 150 make @b an empty list.  See L</"Pass by Reference"> for alternatives.
 151
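One possible way around this, sketched under the assumption that you are free to change the return convention, is to return one array reference per result so that each list keeps its identity.  C<upcase_each> is a hypothetical name, and it uses C<uc> rather than C<tr> purely for brevity.

```perl
use strict;
use warnings;

# upcase_each() is a hypothetical variant of upcase(): it takes array
# references and returns one new array reference per argument.
sub upcase_each {
    return map { [ map { uc $_ } @$_ ] } @_;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);

my ($first, $second) = upcase_each(\@list1, \@list2);
print "@$first | @$second\n";    # prints "A B | C D E"
```
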
 152 A subroutine may be called using the "&" prefix.  The "&" is optional
 153 in modern Perls, and so are the parentheses if the subroutine has been
 154 pre-declared.  (Note, however, that the "&" is I<NOT> optional when
 155 you're just naming the subroutine, such as when it's used as an
 156 argument to defined() or undef().  Nor is it optional when you want to
 157 do an indirect subroutine call with a subroutine name or reference
 158 using the C<&$subref()> or C<&{$subref}()> constructs.  See L<perlref>
 159 for more on that.)
 160
 161 Subroutines may be called recursively.  If a subroutine is called using
 162 the "&" form, the argument list is optional, and if omitted, no @_ array is
 163 set up for the subroutine: the @_ array at the time of the call is
 164 visible to the subroutine instead.  This is an efficiency mechanism that
 165 new users may wish to avoid.
 166
 167     &foo(1,2,3);        # pass three arguments
 168     foo(1,2,3);         # the same
 169
 170     foo();              # pass a null list
 171     &foo();             # the same
 172
 173     &foo;               # foo() gets current args, like foo(@_) !!
 174     foo;                # like foo() IFF sub foo pre-declared, else "foo"
 175
 176 Not only does the "&" form make the argument list optional, but it also
 177 disables any prototype checking on the arguments you do provide.  This
 178 is partly for historical reasons, and partly for having a convenient way
 179 to cheat if you know what you're doing.  See the section on Prototypes below.
 180
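The difference between C<&foo;> and C<&foo();> is easy to miss, so here is a minimal sketch of it.  The subroutine names C<show_args> and C<outer> are invented for the illustration.

```perl
use strict;
use warnings;

# show_args() is a hypothetical helper that reports what it finds in @_.
sub show_args {
    return join ',', @_;
}

sub outer {
    my $implicit = &show_args;     # no argument list: sees outer()'s @_
    my $explicit = &show_args();   # empty argument list: gets its own empty @_
    return ($implicit, $explicit);
}

my ($implicit, $explicit) = outer('a', 'b');
print "implicit: '$implicit', explicit: '$explicit'\n";
# prints "implicit: 'a,b', explicit: ''"
```
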
 181 =head2 Private Variables via my()
 182
 183 Synopsis:
 184
 185     my $foo;            # declare $foo lexically local
 186     my (@wid, %get);    # declare list of variables local
 187     my $foo = "flurp";  # declare $foo lexical, and init it
 188     my @oof = @bar;     # declare @oof lexical, and init it
 189
 190 A "my" declares the listed variables to be confined (lexically) to the
 191 enclosing block, conditional (C<if/unless/elsif/else>), loop
 192 (C<for/foreach/while/until/continue>), subroutine, C<eval>, or
 193 C<do/require/use>'d file.  If more than one value is listed, the list
 194 must be placed in parentheses.  All listed elements must be legal lvalues.
 195 Only alphanumeric identifiers may be lexically scoped--magical
 196 builtins like $/ must currently be localized with "local" instead.
 197
 198 Unlike dynamic variables created by the "local" statement, lexical
 199 variables declared with "my" are totally hidden from the outside world,
 200 including any called subroutines (even if it's the same subroutine called
 201 from itself or elsewhere--every call gets its own copy).
 202
 203 (An eval(), however, can see the lexical variables of the scope it is
 204 being evaluated in so long as the names aren't hidden by declarations within
 205 the eval() itself.  See L<perlref>.)
 206
 207 The parameter list to my() may be assigned to if desired, which allows you
 208 to initialize your variables.  (If no initializer is given for a
 209 particular variable, it is created with the undefined value.)  Commonly
 210 this is used to name the parameters to a subroutine.  Examples:
 211
 212     $arg = "fred";        # "global" variable
 213     $n = cube_root(27);
 214     print "$arg thinks the root is $n\n";
 215  fred thinks the root is 3
 216
 217     sub cube_root {
 218         my $arg = shift;  # name doesn't matter
 219         $arg **= 1/3;
 220         return $arg;
 221     }
 222
 223 The "my" is simply a modifier on something you might assign to.  So when
 224 you do assign to the variables in its argument list, the "my" doesn't
 225 change whether those variables are viewed as a scalar or an array.  So
 226
 227     my ($foo) = <STDIN>;
 228     my @FOO = <STDIN>;
 229
 230 both supply a list context to the right-hand side, while
 231
 232     my $foo = <STDIN>;
 233
 234 supplies a scalar context.  But the following declares only one variable:
 235
 236     my $foo, $bar = 1;
 237
 238 That has the same effect as
 239
 240     my $foo;
 241     $bar = 1;
 242
 243 The declared variable is not introduced (is not visible) until after
 244 the current statement.  Thus,
 245
 246     my $x = $x;
 247
 248 can be used to initialize the new $x with the value of the old $x, and
 249 the expression
 250
 251     my $x = 123 and $x == 123
 252
 253 is false unless the old $x happened to have the value 123.
 254
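A short sketch of the visibility rule just described, using an inner block so that both variables named C<$x> are lexicals.  The helper variable C<$seen_inside> exists only to carry the inner value out of the block.

```perl
use strict;
use warnings;

my $x = 10;
my $seen_inside;
{
    # The $x on the right is still the outer $x; the new $x
    # only becomes visible in the next statement.
    my $x = $x + 1;
    $seen_inside = $x;
}
print "inner saw $seen_inside, outer is still $x\n";
# prints "inner saw 11, outer is still 10"
```
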
 255 Lexical scopes of control structures are not bounded precisely by the
 256 braces that delimit their controlled blocks; control expressions are
 257 part of the scope, too.  Thus in the loop
 258
 259     while (my $line = <>) {
 260         $line = lc $line;
 261     } continue {
 262         print $line;
 263     }
 264
 265 the scope of $line extends from its declaration throughout the rest of
 266 the loop construct (including the C<continue> clause), but not beyond
 267 it.  Similarly, in the conditional
 268
 269     if ((my $answer = <STDIN>) =~ /^yes$/i) {
 270         user_agrees();
 271     } elsif ($answer =~ /^no$/i) {
 272         user_disagrees();
 273     } else {
 274         chomp $answer;
 275         die "'$answer' is neither 'yes' nor 'no'";
 276     }
 277
 278 the scope of $answer extends from its declaration throughout the rest
 279 of the conditional (including C<elsif> and C<else> clauses, if any),
 280 but not beyond it.
 281
 282 (None of the foregoing applies to C<if/unless> or C<while/until>
 283 modifiers appended to simple statements.  Such modifiers are not
 284 control structures and have no effect on scoping.)
 285
 286 The C<foreach> loop defaults to scoping its index variable dynamically
 287 (in the manner of C<local>; see below).  However, if the index
 288 variable is prefixed with the keyword "my", then it is lexically
 289 scoped instead.  Thus in the loop
 290
 291     for my $i (1, 2, 3) {
 292         some_function();
 293     }
 294
 295 the scope of $i extends to the end of the loop, but not beyond it, and
 296 so the value of $i is unavailable in some_function().
 297
 298 Some users may wish to encourage the use of lexically scoped variables.
 299 As an aid to catching implicit references to package variables,
 300 if you say
 301
 302     use strict 'vars';
 303
 304 then any variable reference from there to the end of the enclosing
 305 block must either refer to a lexical variable, or must be fully
 306 qualified with the package name.  A compilation error results
 307 otherwise.  An inner block may countermand this with S<"no strict 'vars'">.
 308
 309 A my() has both a compile-time and a run-time effect.  At compile time,
 310 the compiler takes notice of it; the principal usefulness of this is to
 311 quiet C<use strict 'vars'>.  The actual initialization doesn't happen
 312 until run time, so it gets executed every time through a loop.
 313
 314 Variables declared with "my" are not part of any package and are therefore
 315 never fully qualified with the package name.  In particular, you're not
 316 allowed to try to make a package variable (or other global) lexical:
 317
 318     my $pack::var;      # ERROR!  Illegal syntax
 319     my $_;              # also illegal (currently)
 320
 321 In fact, dynamic variables (also known as package or global variables)
 322 are still accessible using the fully qualified :: notation even while a
 323 lexical of the same name is also visible:
 324
 325     package main;
 326     local $x = 10;
 327     my    $x = 20;
 328     print "$x and $::x\n";
 329
 330 That will print out 20 and 10.
 331
 332 You may declare "my" variables at the outermost scope of a file to
 333 hide any such identifiers totally from the outside world.  This is similar
 334 to C's static variables at the file level.  To do this with a subroutine
 335 requires the use of a closure (anonymous function).  If a block (such as
 336 an eval(), function, or C<package>) wants to create a private subroutine
 337 that cannot be called from outside that block, it can declare a lexical
 338 variable containing an anonymous sub reference:
 339
 340     my $secret_version = '1.001-beta';
 341     my $secret_sub = sub { print $secret_version };
 342     &$secret_sub();
 343
 344 As long as the reference is never returned by any function within the
 345 module, no outside module can see the subroutine, because its name is not in
 346 any package's symbol table.  Remember that it's not I<REALLY> called
 347 $some_pack::secret_version or anything; it's just $secret_version,
 348 unqualified and unqualifiable.
 349
 350 This does not work with object methods, however; all object methods have
 351 to be in the symbol table of some package to be found.
 352
 353 Just because the lexical variable is lexically (also called statically)
 354 scoped doesn't mean that within a function it works like a C static.  It
 355 normally works more like a C auto.  But here's a mechanism for giving a
 356 function private variables with both lexical scoping and a static
 357 lifetime.  If you do want to create something like C's static variables,
 358 just enclose the whole function in an extra block, and put the
 359 static variable outside the function but in the block.
 360
 361     {
 362         my $secret_val = 0;
 363         sub gimme_another {
 364             return ++$secret_val;
 365         }
 366     }
 367     # $secret_val now becomes unreachable by the outside
 368     # world, but retains its value between calls to gimme_another
 369
 370 If this function is being sourced in from a separate file
 371 via C<require> or C<use>, then this is probably just fine.  If it's
 372 all in the main program, you'll need to arrange for the my()
 373 to be executed early, either by putting the whole block above
 374 your main program, or more likely, placing merely a BEGIN
 375 sub around it to make sure it gets executed before your program
 376 starts to run:
 377
 378     sub BEGIN {
 379         my $secret_val = 0;
 380         sub gimme_another {
 381             return ++$secret_val;
 382         }
 383     }
 384
 385 See L<perlmod> about the BEGIN function.
 386
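To see why the early execution matters, consider this sketch, in which the calls appear in the file before the block that initializes the counter.  The names C<next_id> and C<$last_id> and the starting value of 100 are assumptions made for the example.

```perl
use strict;
use warnings;

# These calls appear before the block in the file, but BEGIN has
# already run at compile time, so the counter is initialized.
my $first  = next_id();
my $second = next_id();
print "$first $second\n";    # prints "101 102"

BEGIN {
    my $last_id = 100;       # hypothetical starting value
    sub next_id {
        return ++$last_id;
    }
}
```

With a plain block instead of C<BEGIN>, the assignment to C<$last_id> would not yet have run when the first call is made, so the counter would start from an undefined value.
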
 387 =head2 Temporary Values via local()
 388
 389 B<NOTE>: In general, you should be using "my" instead of "local", because
 390 it's faster and safer.  Exceptions to this include the global punctuation
 391 variables, filehandles and formats, and direct manipulation of the Perl
 392 symbol table itself.  Format variables often use "local" though, as do
 393 other variables whose current value must be visible to called
 394 subroutines.
 395
 396 Synopsis:
 397
 398     local $foo;                 # declare $foo dynamically local
 399     local (@wid, %get);         # declare list of variables local
 400     local $foo = "flurp";       # declare $foo dynamic, and init it
 401     local @oof = @bar;          # declare @oof dynamic, and init it
 402
 403     local *FH;                  # localize $FH, @FH, %FH, &FH  ...
 404     local *merlyn = *randal;    # now $merlyn is really $randal, plus
 405                                 #     @merlyn is really @randal, etc
 406     local *merlyn = 'randal';   # SAME THING: promote 'randal' to *randal
 407     local *merlyn = \$randal;   # just alias $merlyn, not @merlyn etc
 408
 409 A local() modifies its listed variables to be local to the enclosing
 410 block (or subroutine, C<eval{}>, or C<do>) and I<any subroutine called from
 411 within that block>.  A local() just gives temporary values to global
 412 (meaning package) variables.  This is known as dynamic scoping.  Lexical
 413 scoping is done with "my", which works more like C's auto declarations.
 414
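The contrast between dynamic and lexical scoping can be sketched as follows.  The variable C<$mode> and the three subroutines are hypothetical; C<our> is used only so that the example compiles under C<use strict>.

```perl
use strict;
use warnings;

our $mode = 'global';          # a package variable, $main::mode

# report() is a hypothetical helper that reads the package variable.
sub report { return $mode }

sub with_local {
    local $mode = 'local';     # temporary value, visible to called subs
    return report();
}

sub with_my {
    my $mode = 'lexical';      # a new lexical, invisible to report()
    return report();
}

print with_local(), "\n";      # prints "local"
print with_my(), "\n";         # prints "global"
print report(), "\n";          # prints "global": the old value was restored
```
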
 415 If more than one variable is given to local(), they must be placed in
 416 parentheses.  All listed elements must be legal lvalues.  This operator works
 417 by saving the current values of those variables in its argument list on a
 418 hidden stack and restoring them upon exiting the block, subroutine, or
 419 eval.  This means that called subroutines can also reference the local
 420 variable, but not the global one.  The argument list may be assigned to if
 421 desired, which allows you to initialize your local variables.  (If no
 422 initializer is given for a particular variable, it is created with an
 423 undefined value.)  Commonly this is used to name the parameters to a
 424 subroutine.  Examples:
 425
 426     for $i ( 0 .. 9 ) {
 427         $digits{$i} = $i;
 428     }
 429     # assume this function uses global %digits hash
 430     parse_num();
 431
 432     # now temporarily add to %digits hash
 433     if ($base12) {
 434         # (NOTE: not claiming this is efficient!)
 435         local %digits  = (%digits, 't' => 10, 'e' => 11);
 436         parse_num();  # parse_num gets this new %digits!
 437     }
 438     # old %digits restored here
 439
 440 Because local() is a run-time command, it gets executed every time
 441 through a loop.  In releases of Perl previous to 5.0, this used more stack
 442 storage each time until the loop was exited.  Perl now reclaims the space
 443 each time through, but it's still more efficient to declare your variables
 444 outside the loop.
 445
 446 A local is simply a modifier on an lvalue expression.  When you assign to
 447 a localized variable, the local doesn't change whether its list is viewed
 448 as a scalar or an array.  So
 449
 450     local($foo) = <STDIN>;
 451     local @FOO = <STDIN>;
 452
 453 both supply a list context to the right-hand side, while
 454
 455     local $foo = <STDIN>;
 456
 457 supplies a scalar context.
 458
 459 =head2 Passing Symbol Table Entries (typeglobs)
 460
 461 [Note:  The mechanism described in this section was originally the only
 462 way to simulate pass-by-reference in older versions of Perl.  While it
 463 still works fine in modern versions, the new reference mechanism is
 464 generally easier to work with.  See below.]
 465
 466 Sometimes you don't want to pass the value of an array to a subroutine
 467 but rather the name of it, so that the subroutine can modify the global
 468 copy of it rather than working with a local copy.  In Perl you can
 469 refer to all objects of a particular name by prefixing the name
 470 with a star: C<*foo>.  This is often known as a "typeglob", because the
 471 star on the front can be thought of as a wildcard match for all the
 472 funny prefix characters on variables and subroutines and such.
 473
 474 When evaluated, the typeglob produces a scalar value that represents
 475 all the objects of that name, including any filehandle, format, or
 476 subroutine.  When assigned to, it causes the name mentioned to refer to
 477 whatever "*" value was assigned to it.  Example:
 478
 479     sub doubleary {
 480         local(*someary) = @_;
 481         foreach $elem (@someary) {
 482             $elem *= 2;
 483         }
 484     }
 485     doubleary(*foo);
 486     doubleary(*bar);
 487
 488 Note that scalars are already passed by reference, so you can modify
 489 scalar arguments without using this mechanism by referring explicitly
 490 to C<$_[0]> etc.  You can modify all the elements of an array by passing
 491 all the elements as scalars, but you have to use the * mechanism (or
 492 the equivalent reference mechanism) to push, pop, or change the size of
 493 an array.  It will certainly be faster to pass the typeglob (or reference).
 494
 495 Even if you don't want to modify an array, this mechanism is useful for
 496 passing multiple arrays in a single LIST, because normally the LIST
 497 mechanism will merge all the array values so that you can't extract out
 498 the individual arrays.  For more on typeglobs, see
 499 L<perldata/"Typeglobs and FileHandles">.
 500
 501 =head2 Pass by Reference
 502
 503 If you want to pass more than one array or hash into a function--or
 504 return them from it--and have them maintain their integrity, then
 505 you're going to have to use an explicit pass-by-reference.  Before you
 506 do that, you need to understand references as detailed in L<perlref>.
 507 This section may not make much sense to you otherwise.
 508
 509 Here are a few simple examples.  First, let's pass in several
 510 arrays to a function and have it pop all of them, returning a new
 511 list of all their former last elements:
 512
 513     @tailings = popmany ( \@a, \@b, \@c, \@d );
 514
 515     sub popmany {
 516         my $aref;
 517         my @retlist = ();
 518         foreach $aref ( @_ ) {
 519             push @retlist, pop @$aref;
 520         }
 521         return @retlist;
 522     }
 523
 524 Here's how you might write a function that returns a
 525 list of keys occurring in all the hashes passed to it:
 526
 527     @common = inter( \%foo, \%bar, \%joe );
 528     sub inter {
 529         my ($k, $href, %seen); # locals
 530         foreach $href (@_) {
 531             while ( $k = each %$href ) {
 532                 $seen{$k}++;
 533             }
 534         }
 535         return grep { $seen{$_} == @_ } keys %seen;
 536     }
 537
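For readers who want to try this, here is a self-contained sketch of the same idea with sample data.  It is a variant, not the definition above: it iterates with C<keys> instead of C<each>, and the contents of the three hashes are invented.

```perl
use strict;
use warnings;

# Same idea as inter() above, with lexical loop variables.
sub inter {
    my %seen;
    foreach my $href (@_) {
        $seen{$_}++ for keys %$href;
    }
    # A key is common if it was seen once per hash passed in.
    return grep { $seen{$_} == @_ } keys %seen;
}

my %foo = (a => 1, b => 2, c => 3);
my %bar = (b => 1, c => 1, d => 1);
my %joe = (c => 1, b => 5);

# Sort the result, since hash key order is not predictable.
my @common = sort { $a cmp $b } inter(\%foo, \%bar, \%joe);
print "@common\n";    # prints "b c"
```
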
 538 So far, we're using just the normal list return mechanism.
 539 What happens if you want to pass or return a hash?  Well,
 540 if you're using only one of them, or you don't mind them
 541 concatenating, then the normal calling convention is ok, although
 542 a little expensive.
 543
 544 Where people get into trouble is here:
 545
 546     (@a, @b) = func(@c, @d);
 547 or
 548     (%a, %b) = func(%c, %d);
 549
 550 That syntax simply won't work.  It sets just @a or %a and clears the @b or
 551 %b.  Plus the function didn't get passed two separate arrays or
 552 hashes: it got one long list in @_, as always.
 553
 554 If you can arrange for everyone to deal with this through references, it's
 555 cleaner code, although not so nice to look at.  Here's a function that
 556 takes two array references as arguments, returning the two references
 557 in order of how many elements the arrays have in them:
 558
 559     ($aref, $bref) = func(\@c, \@d);
 560     print "@$aref has more than @$bref\n";
 561     sub func {
 562         my ($cref, $dref) = @_;
 563         if (@$cref > @$dref) {
 564             return ($cref, $dref);
 565         } else {
 566             return ($dref, $cref);
 567         }
 568     }
 569
 570 It turns out that you can actually do this also:
 571
 572     (*a, *b) = func(\@c, \@d);
 573     print "@a has more than @b\n";
 574     sub func {
 575         local (*c, *d) = @_;
 576         if (@c > @d) {
 577             return (\@c, \@d);
 578         } else {
 579             return (\@d, \@c);
 580         }
 581     }
 582
 583 Here we're using the typeglobs to do symbol table aliasing.  It's
 584 a tad subtle, though, and also won't work if you're using my()
 585 variables, because only globals (well, and local()s) are in the symbol table.
 586
 587 If you're passing around filehandles, you could usually just use the bare
 588 typeglob, like *STDOUT, but typeglobs references would be better because
 589 they'll still work properly under C<use strict 'refs'>.  For example:
 590
 591     splutter(\*STDOUT);
 592     sub splutter {
 593         my $fh = shift;
 594         print $fh "her um well a hmmm\n";
 595     }
 596
 597     $rec = get_rec(\*STDIN);
 598     sub get_rec {
 599         my $fh = shift;
 600         return scalar <$fh>;
 601     }
 602
 603 Another way to do this is using *HANDLE{IO}; see L<perlref> for usage
 604 and caveats.
 605
 606 If you're planning on generating new filehandles, you could do this:
 607
 608     sub openit {
 609         my $path = shift;
 610         local *FH;
 611         return open (FH, $path) ? \*FH : undef;
 612     }
 613
 614 Although that will actually produce a small memory leak.  See the bottom
 615 of L<perlfunc/open()> for a somewhat cleaner way using the IO::Handle
 616 package.
 617
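As an alternative sketch, assuming a Perl recent enough to support lexical filehandles and three-argument C<open> (5.6 or later), the handle can live in a C<my> variable and be passed around like any other scalar.  The name C<open_for_reading> and the temporary file are hypothetical scaffolding for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# open_for_reading() is a hypothetical helper: it returns a lexical
# filehandle on success and nothing (undef in scalar context) on failure.
sub open_for_reading {
    my $path = shift;
    open(my $fh, '<', $path) or return;
    return $fh;
}

sub get_rec {
    my $fh = shift;
    return scalar <$fh>;
}

# Create a small scratch file so the example is self-contained.
my ($tmp, $tmpname) = tempfile(UNLINK => 1);
print {$tmp} "first record\nsecond record\n";
close $tmp or die "cannot close $tmpname: $!";

my $in  = open_for_reading($tmpname) or die "cannot open $tmpname: $!";
my $rec = get_rec($in);
print $rec;    # prints "first record"
close $in;
```
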
 618 =head2 Prototypes
 619
 620 As of the 5.002 release of perl, if you declare
 621
 622     sub mypush (\@@)
 623
 624 then mypush() takes arguments exactly like push() does.  The declaration
 625 of the function to be called must be visible at compile time.  The prototype
 626 affects only the interpretation of new-style calls to the function, where
 627 new-style is defined as not using the C<&> character.  In other words,
 628 if you call it like a builtin function, then it behaves like a builtin
 629 function.  If you call it like an old-fashioned subroutine, then it
 630 behaves like an old-fashioned subroutine.  It naturally falls out from
 631 this rule that prototypes have no influence on subroutine references
 632 like C<\&foo> or on indirect subroutine calls like C<&{$subref}>.
 633
 634 Method calls are not influenced by prototypes either, because the
 635 function to be called is indeterminate at compile time, because it depends
 636 on inheritance.
 637
 638 Because the intent is primarily to let you define subroutines that work
 639 like builtin commands, here are the prototypes for some other functions
 640 that parse almost exactly like the corresponding builtins.
 641
 642     Declared as                 Called as
 643
 644     sub mylink ($$)             mylink $old, $new
 645     sub myvec ($$$)             myvec $var, $offset, 1
 646     sub myindex ($$;$)          myindex &getstring, "substr"
 647     sub mysyswrite ($$$;$)      mysyswrite $buf, 0, length($buf) - $off, $off
 648     sub myreverse (@)           myreverse $a,$b,$c
 649     sub myjoin ($@)             myjoin ":",$a,$b,$c
 650     sub mypop (\@)              mypop @array
 651     sub mysplice (\@$$@)        mysplice @array,@array,0,@pushme
 652     sub mykeys (\%)             mykeys %{$hashref}
 653     sub myopen (*;$)            myopen HANDLE, $name
 654     sub mypipe (**)             mypipe READHANDLE, WRITEHANDLE
 655     sub mygrep (&@)             mygrep { /foo/ } $a,$b,$c
 656     sub myrand ($)              myrand 42
 657     sub mytime ()               mytime
 658
 659 Any backslashed prototype character represents an actual argument
 660 that absolutely must start with that character.  The value passed
 661 to the subroutine (as part of C<@_>) will be a reference to the
 662 actual argument given in the subroutine call, obtained by applying
 663 C<\> to that argument.
 664
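A minimal sketch of what the backslashed C<\@> does in practice, using the C<mypush> declaration from the start of this section.  The body shown here is one plausible implementation, not a definition taken from elsewhere.

```perl
use strict;
use warnings;

# The prototype must be seen before the call is compiled, so the
# definition comes first.
sub mypush (\@@) {
    my ($aref, @items) = @_;   # $aref is a reference to the caller's array
    push @$aref, @items;
    return scalar @$aref;
}

my @stack = (1);
my $count = mypush @stack, 2, 3;   # called like the builtin: no backslash needed
print "$count: @stack\n";          # prints "3: 1 2 3"
```
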
 665 Unbackslashed prototype characters have special meanings.  Any
 666 unbackslashed @ or % eats all the rest of the arguments, and forces
 667 list context.  An argument represented by $ forces scalar context.  An
 668 & requires an anonymous subroutine, which, if passed as the first
 669 argument, does not require the "sub" keyword or a subsequent comma.  A
 670 * does whatever it has to do to turn the argument into a reference to a
 671 symbol table entry.
 672
 673 A semicolon separates mandatory arguments from optional arguments.
 674 (It is redundant before @ or %.)
 675
 676 Note how the last three examples above are treated specially by the parser.
 677 mygrep() is parsed as a true list operator, myrand() is parsed as a
 678 true unary operator with unary precedence the same as rand(), and
 679 mytime() is truly without arguments, just like time().  That is, if you
 680 say
 681
 682     mytime +2;
 683
 684 you'll get mytime() + 2, not mytime(2), which is how it would be parsed
 685 without the prototype.
 686
 687 The interesting thing about & is that you can generate new syntax with it:
 688
 689     sub try (&@) {
 690         my($try,$catch) = @_;
 691         eval { &$try };
 692         if ($@) {
 693             local $_ = $@;
 694             &$catch;
 695         }
 696     }
 697     sub catch (&) { $_[0] }
 698
 699     try {
 700         die "phooey";
 701     } catch {
 702         /phooey/ and print "unphooey\n";
 703     };
 704
 705 That prints "unphooey".  (Yes, there are still unresolved
 706 issues having to do with the visibility of @_.  I'm ignoring that
 707 question for the moment.  (But note that if we make @_ lexically
 708 scoped, those anonymous subroutines can act like closures... (Gee,
 709 is this sounding a little Lispish?  (Never mind.))))
 710
 711 And here's a reimplementation of grep:
 712
 713     sub mygrep (&@) {
 714         my $code = shift;
 715         my @result;
 716         foreach $_ (@_) {
 717             push(@result, $_) if &$code;
 718         }
 719         @result;
 720     }
 721
 722 Some folks would prefer full alphanumeric prototypes.  Alphanumerics have
 723 been intentionally left out of prototypes for the express purpose of
 724 someday in the future adding named, formal parameters.  The current
 725 mechanism's main goal is to let module writers provide better diagnostics
 726 for module users.  Larry feels the notation quite understandable to Perl
 727 programmers, and that it will not intrude greatly upon the meat of the
 728 module, nor make it harder to read.  The line noise is visually
 729 encapsulated into a small pill that's easy to swallow.
 730
 731 It's probably best to prototype new functions, not retrofit prototyping
 732 into older ones.  That's because you must be especially careful about
 733 silent impositions of differing list versus scalar contexts.  For example,
 734 if you decide that a function should take just one parameter, like this:
 735
 736     sub func ($) {
 737         my $n = shift;
 738         print "you gave me $n\n";
 739     }
 740
 741 and someone has been calling it with an array or expression
 742 returning a list:
 743
 744     func(@foo);
 745     func( split /:/ );
 746
 747 Then you've just supplied an automatic scalar() in front of their
 748 argument, which can be more than a bit surprising.  The old @foo
 749 which used to hold one thing doesn't get passed in.  Instead,
 750 the func() now gets passed in 1, that is, the number of elements
 751 in @foo.  And the split() gets called in a scalar context and
 752 starts scribbling on your @_ parameter list.
 753
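The surprise can be reproduced in a few lines.  In this sketch the hypothetical C<one_arg> simply returns its first argument, which makes the imposed scalar context visible; calling it with C<&> shows the old behavior, since that form ignores the prototype.

```perl
use strict;
use warnings;

sub one_arg ($) {
    return $_[0];
}

my @foo = ('only');

my $with_proto = one_arg(@foo);    # @foo is evaluated in scalar context
my $without    = &one_arg(@foo);   # & bypasses the prototype
print "$with_proto / $without\n";  # prints "1 / only"
```
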
 754 This is all very powerful, of course, and should be used only in moderation
 755 to make the world a better place.
 756
 757 If you redefine a subroutine which was eligible for inlining you'll get
 758 a mandatory warning.  (You can use this warning to tell whether or not a
 759 particular subroutine is considered constant.)  The warning is
 760 considered severe enough not to be optional because previously compiled
 761 invocations of the function will still be using the old value of the
 762 function.  If you need to be able to redefine the subroutine you need to
 763 ensure that it isn't inlined, either by dropping the C<()> prototype
 764 (which changes the calling semantics, so beware) or by thwarting the
 765 inlining mechanism in some other way, such as
 766
 767     my $dummy;
 768     sub not_inlined () {
 769         $dummy || 23
 770     }
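The following sketch contrasts the two cases. C<ANSWER> is a hypothetical constant subroutine, and the C<no warnings 'redefine'> line is there only to keep the demonstration quiet; treat the exact warning behavior as something that may differ between perl versions.

```perl
use strict;
use warnings;

sub ANSWER () { 42 }                  # constant body, () prototype: inlinable

my $dummy;
sub not_inlined () { $dummy || 23 }   # depends on a variable: not inlined

{
    no warnings 'redefine';           # keep this demonstration quiet
    *ANSWER      = sub () { 7 };      # redefine both at run time
    *not_inlined = sub () { 99 };
}

my $compiled_in = ANSWER;        # 42: the old value was inlined at compile time
my $looked_up   = &ANSWER();     # 7:  the & form is never inlined
my $flexible    = not_inlined;   # 99: the call is resolved at run time
```

The first result is the reason the warning matters: code compiled before the redefinition keeps using the old value.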
 771
 772 =head2 Overriding Builtin Functions
 773
 774 Many builtin functions may be overridden, though this should be tried
 775 only occasionally and for good reason.  Typically this might be
 776 done by a package attempting to emulate missing builtin functionality
 777 on a non-Unix system.
 778
 779 Overriding may be done only by importing the name from a
 780 module--ordinary predeclaration isn't good enough.  However, the
 781 C<subs> pragma (compiler directive) lets you, in effect, pre-declare subs
 782 via the import syntax, and these names may then override the builtin ones:
 783
 784     use subs 'chdir', 'chroot', 'chmod', 'chown';
 785     chdir $somewhere;
 786     sub chdir { ... }
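A filled-in sketch of one such override might look like this. The C<@visited> log is a hypothetical addition for illustration, and the real work is handed to the builtin through the C<CORE::> prefix.

```perl
use strict;
use warnings;
use subs 'chdir';            # pre-declare via the import mechanism

my @visited;                 # hypothetical log of requested directories

sub chdir {
    my $dir = shift;
    push @visited, $dir;             # extra behavior added by the override
    return CORE::chdir($dir);        # delegate to the real builtin
}

my $ok = chdir '.';          # calls the override, not the builtin
```

Calling C<CORE::chdir> inside the override avoids infinite recursion, since a plain C<chdir> there would call the override again.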
 787
 788 Library modules should not in general export builtin names like "open"
 789 or "chdir" as part of their default @EXPORT list, because these may
 790 sneak into someone else's namespace and change the semantics unexpectedly.
 791 Instead, if the module adds the name to the @EXPORT_OK list, then it's
 792 possible for a user to import the name explicitly, but not implicitly.
 793 That is, they could say
 794
 795     use Module 'open';
 796
 797 and it would import the open override, but if they said
 798
 799     use Module;
 800
 801 they would get the default imports without the overrides.
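The difference between the two import styles can be sketched with a module written inline. C<My::Override>, C<Default::User> and C<Explicit::User> are hypothetical names, and the C<BEGIN> blocks stand in for the C<use> statements a real program would write.

```perl
use strict;
use warnings;

# Hypothetical module, kept in the same file so the sketch is self-contained.
package My::Override;
use Exporter 'import';
our (@EXPORT, @EXPORT_OK, @calls);
BEGIN {
    @EXPORT    = qw(helper);    # imported by a plain "use My::Override;"
    @EXPORT_OK = qw(chdir);     # imported only when asked for by name
}
sub helper { return 'helper' }
sub chdir  { push @calls, $_[0]; return CORE::chdir($_[0]) }

package Default::User;
BEGIN { My::Override->import }            # like "use My::Override;"
sub go { return chdir '.' }               # still the builtin chdir

package Explicit::User;
BEGIN { My::Override->import('chdir') }   # like "use My::Override 'chdir';"
sub go { return chdir '.' }               # now the override

package main;
Default::User::go();
my $after_default  = scalar @My::Override::calls;   # 0
Explicit::User::go();
my $after_explicit = scalar @My::Override::calls;   # 1
```

Only the package that named C<chdir> in its import list sees the changed semantics.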
 802
 803 =head2 Autoloading
 804
 805 If you call a subroutine that is undefined, you would ordinarily get an
 806 immediate fatal error complaining that the subroutine doesn't exist.
 807 (Likewise for subroutines being used as methods, when the method
 808 doesn't exist in any of the base classes of the class package.) If,
 809 however, there is an C<AUTOLOAD> subroutine defined in the package or
 810 packages that were searched for the original subroutine, then that
 811 C<AUTOLOAD> subroutine is called with the arguments that would have been
 812 passed to the original subroutine.  The fully qualified name of the
 813 original subroutine magically appears in the $AUTOLOAD variable in the
 814 same package as the C<AUTOLOAD> routine.  The name is not passed as an
 815 ordinary argument because, er, well, just because, that's why...
 816
 817 Most C<AUTOLOAD> routines will load in a definition for the subroutine in
 818 question using eval, and then execute that subroutine using a special
 819 form of "goto" that erases the stack frame of the C<AUTOLOAD> routine
 820 without a trace.  (See the standard C<AutoLoader> module, for example.)
 821 But an C<AUTOLOAD> routine can also just emulate the routine and never
 822 define it.  For example, let's pretend that a function that wasn't defined
 823 should just call system() with those arguments.  All you'd do is this:
 824
 825     sub AUTOLOAD {
 826         my $program = $AUTOLOAD;
 827         $program =~ s/.*:://;
 828         system($program, @_);
 829     }
 830     date();
 831     who('am', 'i');
 832     ls('-l');
 833
 834 In fact, if you pre-declare the functions you want to call that way, you don't
 835 even need the parentheses:
 836
 837     use subs qw(date who ls);
 838     date;
 839     who "am", "i";
 840     ls '-l';
 841
 842 A more complete example of this is the standard Shell module, which
 843 can treat undefined subroutine calls as calls to Unix programs.
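The other approach described above, where C<AUTOLOAD> defines the missing routine with C<eval> and then jumps to it with C<goto>, might be sketched as follows. The C<greet_> naming scheme and the C<$generated> counter are assumptions made purely for illustration.

```perl
use strict;
use warnings;

our $AUTOLOAD;        # set by perl before AUTOLOAD is entered
my $generated = 0;    # hypothetical counter of subs built on demand

sub AUTOLOAD {
    my $full = $AUTOLOAD;                    # e.g. "main::greet_world"
    $full =~ /::greet_(\w+)$/
        or die "Undefined subroutine $full called";
    my $who = $1;
    $generated++;
    # Define the missing subroutine under its fully qualified name.
    eval "sub $full { return join ' ', 'hello', '$who', \@_ } 1" or die $@;
    my $code = \&$full;    # \&$name is permitted under "use strict"
    goto &$code;           # replace this frame with a call to the new sub
}

my $first  = greet_world();          # goes through AUTOLOAD: "hello world"
my $second = greet_world('again');   # direct call: "hello world again"
```

After the first call the subroutine exists, so C<AUTOLOAD> is not consulted again and C<$generated> stays at 1.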
 844
 845 Mechanisms are available for module writers to help split their modules
 846 up into autoloadable files.  See the standard AutoLoader module
 847 described in L<AutoLoader> and in L<AutoSplit>, the standard
 848 SelfLoader module in L<SelfLoader>, and the document on adding C
 849 functions to perl code in L<perlxs>.
 850
 851 =head1 SEE ALSO
 852
 853 See L<perlref> for more on references.  See L<perlxs> if you'd
 854 like to learn about calling C subroutines from perl.  See
 855 L<perlmod> to learn about bundling up your functions in
 856 separate files.