=head1 NAME
+X<subroutine> X<function>
perlsub - Perl subroutines
=head1 SYNOPSIS
To declare subroutines:
+X<subroutine, declaration> X<sub>
sub NAME; # A "forward" declaration.
sub NAME(PROTO); # ditto, but with prototypes
sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
To define an anonymous subroutine at runtime:
+X<subroutine, anonymous>
$subref = sub BLOCK; # no proto
$subref = sub (PROTO) BLOCK; # with proto
$subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
To import subroutines:
+X<import>
use MODULE qw(NAME1 NAME2 NAME3);
To call subroutines:
+X<subroutine, call> X<call>
NAME(LIST); # & is optional with parentheses.
NAME LIST; # Parentheses optional if predeclared/imported.
contain as many or as few scalar elements as you'd like. (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
+X<subroutine, parameter> X<parameter>
Any arguments passed in show up in the array C<@_>. Therefore, if
you called a function with two arguments, those would be stored in
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
-
-The return value of a subroutine is the value of the last expression
-evaluated by that sub, or the empty list in the case of an empty sub.
-More explicitly, a C<return> statement may be used to exit the
-subroutine, optionally specifying the returned value, which will be
-evaluated in the appropriate context (list, scalar, or void) depending
-on the context of the subroutine call. If you specify no return value,
-the subroutine returns an empty list in list context, the undefined
-value in scalar context, or nothing in void context. If you return
-one or more aggregates (arrays and hashes), these will be flattened
-together into one large indistinguishable list.
+X<subroutine, argument> X<argument> X<@_>
+
+A C<return> statement may be used to exit a subroutine, optionally
+specifying the returned value, which will be evaluated in the
+appropriate context (list, scalar, or void) depending on the context of
+the subroutine call. If you specify no return value, the subroutine
+returns an empty list in list context, the undefined value in scalar
+context, or nothing in void context. If you return one or more
+aggregates (arrays and hashes), these will be flattened together into
+one large indistinguishable list.
+
+If no C<return> is found and if the last statement is an expression, its
+value is returned. If the last statement is a loop control structure
+like a C<foreach> or a C<while>, the returned value is unspecified. The
+empty sub returns the empty list.
+X<subroutine, return value> X<return value> X<return>
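
The context rules above can be illustrated with a minimal sketch. The subroutine names C<nothing> and C<both> are hypothetical and exist only for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: a bare return adapts to the caller's context.
sub nothing { return }

# Hypothetical helper: two arrays are flattened into one list on return.
sub both {
    my @odd  = (1, 3);
    my @even = (2, 4);
    return (@odd, @even);
}

my @list   = nothing();   # empty list in list context
my $scalar = nothing();   # undef in scalar context
my @flat   = both();      # one flat list; the array boundaries are gone

printf "list has %d elements\n", scalar @list;
printf "scalar is %s\n", defined $scalar ? "defined" : "undef";
print  "flat: @flat\n";
```

A caller that needs the two arrays kept apart would have to return references instead, as described under Pass by Reference.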
Perl does not have named formal parameters. In practice all you
do is assign to a C<my()> list of these. Variables that aren't
and L<"Temporary Values via local()">. To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
+X<formal parameter> X<parameter, formal>
Example:
of turning call-by-reference into call-by-value. Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
+X<call-by-reference> X<call-by-value>
upcase_in($v1, $v2); # this changes $v1 and $v2
sub upcase_in {
You aren't allowed to modify constants in this way, of course. If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception. For example, this won't work:
+X<call-by-reference> X<call-by-value>
upcase_in("frederick");
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
+X<&>
Subroutines may be called recursively. If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead. This is an
efficiency mechanism that new users may wish to avoid.
+X<recursion>
&foo(1,2,3); # pass three arguments
foo(1,2,3); # the same
Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide. This
is partly for historical reasons, and partly for having a convenient way
-to cheat if you know what you're doing. See L<Prototypes> below.
+to cheat if you know what you're doing. See L</Prototypes> below.
+X<&>
Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case. A subroutine in
Subroutines that do special, pre-defined things include C<AUTOLOAD>, C<CLONE>,
C<DESTROY> plus all functions mentioned in L<perltie> and L<PerlIO::via>.
-The C<BEGIN>, C<CHECK>, C<INIT> and C<END> subroutines are not so much
-subroutines as named special code blocks, of which you can have more
-than one in a package, and which you can B<not> call explicitely. See
-L<perlmod/"BEGIN, CHECK, INIT and END">
+The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
+are not so much subroutines as named special code blocks, of which you
+can have more than one in a package, and which you can B<not> call
+explicitly. See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.
=head2 Private Variables via my()
+X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
+X<lexical scope> X<attributes, my>
Synopsis:
world, including any called subroutines. This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
+X<local>
This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible. Only dynamic scopes
An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself. See L<perlref>.
+X<eval, scope of>
The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables. (If no initializer is given for a
the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses,
-but not beyond it. See L<perlsyn/"Simple statements"> for information
+but not beyond it. See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.
The C<foreach> loop defaults to scoping its index variable dynamically
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead. Thus
in the loop
+X<foreach> X<for>
for my $i (1, 2, 3) {
some_function();
the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
+X<foreach> X<for>
Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses to package variables,
allowed to try to make a package variable (or other global) lexical:
my $pack::var; # ERROR! Illegal syntax
- my $_; # also illegal (currently)
In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
this.
=head2 Persistent Private Variables
+X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>
+
+There are two ways to build persistent private variables in Perl 5.10.
+First, you can simply use the C<state> feature. Or, you can use closures,
+if you want to stay compatible with releases older than 5.10.
+
+=head3 Persistent variables via state()
+
+Beginning with Perl 5.9.4, you can declare variables with the C<state>
+keyword in place of C<my>. For that to work, though, you must have
+enabled that feature beforehand, either by using the C<feature> pragma, or
+by using C<-E> on one-liners (see L<feature>). Beginning with Perl 5.16,
+you can also write it as C<CORE::state>, which does not require the
+C<feature> pragma.
+
+For example, the following code maintains a private counter, incremented
+each time the gimme_another() function is called:
+
+ use feature 'state';
+ sub gimme_another { state $x; return ++$x }
+
+Also, since C<$x> is lexical, it can't be reached or modified by any Perl
+code outside.
+
+When combined with variable declaration, simple scalar assignment to C<state>
+variables (as in C<state $x = 42>) is executed only the first time. When such
+statements are evaluated subsequent times, the assignment is ignored. The
+behavior of this sort of assignment to non-scalar variables is undefined.
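
As a rough sketch of the initialize-once behavior described above, the following uses a hypothetical C<next_ticket> function whose C<state> variable has an initializer. It assumes a perl new enough to provide the C<state> feature:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counter: the "= 10" initializer runs on the first call only.
sub next_ticket {
    state $n = 10;
    return $n++;
}

# Later calls skip the assignment and keep counting from the stored value.
my @tickets = map { next_ticket() } 1 .. 3;
print "@tickets\n";    # 10 11 12
```

Had C<$n> been declared with C<my>, every call would have returned 10.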
+
+=head3 Persistent variables with closures
Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
}
}
-See L<perlmod/"BEGIN, CHECK, INIT and END"> about the
-special triggered code blocks, C<BEGIN>, C<CHECK>, C<INIT> and C<END>.
+See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
+special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
+C<INIT> and C<END>.
If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics. They are available to all
to create private variables that the whole module can see.
=head2 Temporary Values via local()
+X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
+X<variable, temporary>
B<WARNING>: In general, you should be using C<my> instead of C<local>, because
it's faster and safer. Exceptions to this include the global punctuation
local @oof = @bar; # make @oof dynamic, and init it
local $hash{key} = "val"; # sets a local value for this hash entry
+ delete local $hash{key}; # delete this entry for the current block
local ($cond ? $v1 : $v2); # several types of lvalues support
# localization
a local variable. This is known as dynamic scoping. Lexical scoping
is done with C<my>, which works more like C's auto declarations.
-Some types of lvalues can be localized as well : hash and array elements
+Some types of lvalues can be localized as well: hash and array elements
and slices, conditionals (provided that their result is always
localizable), and symbolic references. As for simple variables, this
creates new, dynamically scoped values.
variables outside the loop.
=head3 Grammatical note on local()
+X<local, context>
A C<local> is simply a modifier on an lvalue expression. When you assign to
a C<local>ized variable, the C<local> doesn't change whether its list is viewed
supplies a scalar context.
=head3 Localization of special variables
+X<local, special variable>
If you localize a special variable, you'll be giving a new value to it,
but its magic won't go away. That means that all side-effects related
local $1 = 2;
-Similarly, but in a way more difficult to spot, the following snippet will
-die in perl 5.9.0 :
-
- sub f { local $_ = "foo"; print }
- for ($1) {
- # now $_ is aliased to $1, thus is magic and readonly
- f();
- }
-
-See next section for an alternative to this situation.
+One exception is the default scalar variable: starting with perl 5.14
+C<local($_)> will always strip all magic from $_, to make it possible
+to safely reuse $_ in a subroutine.
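
The following sketch shows only the general pattern of borrowing C<$_> inside a subroutine; it does not exercise the magic-stripping itself. The name C<count_newlines> is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper that borrows $_ for its own pattern match.
sub count_newlines {
    my ($text) = @_;
    local $_ = $text;          # caller's $_ is restored on return
    my $count = () = /\n/g;    # count the matches via list assignment
    return $count;
}

my @names = qw(alpha beta);
my @log;
for (@names) {
    my $n = count_newlines("x\ny\n");
    push @log, "$_:$n";        # $_ is still the loop element here
}
print "@log\n";
```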
B<WARNING>: Localization of tied arrays and hashes does not currently
work as described.
or hashes (localising individual elements is still okay).
See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
details.
+X<local, tie>
=head3 Localization of globs
+X<local, glob> X<glob>
The construct
will not have any effect on the internal value of the input record
separator.
-Notably, if you want to work with a brand new value of the default scalar
-$_, and avoid the potential problem listed above about $_ previously
-carrying a magic value, you should use C<local *_> instead of C<local $_>.
-As of perl 5.9.1, you can also use the lexical form of C<$_> (declaring it
-with C<my $_>), which avoids completely this problem.
-
=head3 Localization of elements of composite types
+X<local, composite type element> X<local, array element> X<local, hash element>
It's also worth taking a moment to explain what happens when you
C<local>ize a member of a composite type (i.e. an array or hash element).
The behavior of local() on non-existent members of composite
types is subject to change in future.
+=head3 Localized deletion of elements of composite types
+X<delete> X<local, composite type element> X<local, array element> X<local, hash element>
+
+You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
+constructs to delete a composite type entry for the current block and restore
+it when it ends. They return the array/hash value before the localization,
+which means that they are respectively equivalent to
+
+ do {
+ my $val = $array[$idx];
+ local $array[$idx];
+ delete $array[$idx];
+ $val
+ }
+
+and
+
+ do {
+ my $val = $hash{key};
+ local $hash{key};
+ delete $hash{key};
+ $val
+ }
+
+except that for those the C<local> is scoped to the C<do> block. Slices are
+also accepted.
+
+ my %hash = (
+ a => [ 7, 8, 9 ],
+ b => 1,
+ );
+
+ {
+ my $a = delete local $hash{a};
+ # $a is [ 7, 8, 9 ]
+ # %hash is (b => 1)
+
+ {
+ my @nums = delete local @$a[0, 2];
+ # @nums is (7, 9)
+ # $a is [ undef, 8 ]
+
+ $a->[0] = 999; # will be erased when the scope ends
+ }
+ # $a is back to [ 7, 8, 9 ]
+
+ }
+ # %hash is back to its original state
+
=head2 Lvalue subroutines
+X<lvalue> X<subroutine, lvalue>
B<WARNING>: Lvalue subroutines are still experimental and the
implementation may change in future versions of Perl.
my $val;
sub canmod : lvalue {
- # return $val; this doesn't work, don't say "return"
- $val;
+ $val; # or: return $val;
}
sub nomod {
$val;
=item Lvalue subroutines are EXPERIMENTAL
-They appear to be convenient, but there are several reasons to be
+They appear to be convenient, but there is at least one reason to be
circumspect.
-You can't use the return keyword, you must pass out the value before
-falling out of subroutine scope. (see comment in example above). This
-is usually not a problem, but it disallows an explicit return out of a
-deeply nested loop, which is sometimes a nice way out.
-
They violate encapsulation. A normal mutator can check the supplied
argument before setting the attribute it is protecting; an lvalue
subroutine never gets that chance. Consider:
=back
=head2 Passing Symbol Table Entries (typeglobs)
+X<typeglob> X<*>
B<WARNING>: The mechanism described in this section was originally
the only way to simulate pass-by-reference in older versions of
L<perldata/"Typeglobs and Filehandles">.
=head2 When to Still Use local()
+X<local> X<variable, local>
Despite the existence of C<my>, there are still three places where the
C<local> operator still shines. In fact, in these three places, you
a local alias.
{
- local *grow = \&shrink; # only until this block exists
+ local *grow = \&shrink; # only until this block exits
grow(); # really calls shrink()
move(); # if move() grow()s, it shrink()s too
}
=back
=head2 Pass by Reference
+X<pass by reference> X<pass-by-reference> X<reference>
If you want to pass more than one array or hash into a function--or
return them from it--and have them maintain their integrity, then
}
=head2 Prototypes
+X<prototype> X<subroutine, prototype>
Perl supports a very limited kind of compile-time argument checking
using function prototyping. If you declare
- sub mypush (\@@)
+ sub mypush (+@)
then C<mypush()> takes arguments exactly like C<push()> does. The
function declaration must be visible at compile time. The prototype
sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
sub myreverse (@) myreverse $a, $b, $c
sub myjoin ($@) myjoin ":", $a, $b, $c
- sub mypop (\@) mypop @array
- sub mysplice (\@$$@) mysplice @array, @array, 0, @pushme
- sub mykeys (\%) mykeys %{$hashref}
+ sub mypop (+) mypop @array
+ sub mysplice (+$$@) mysplice @array, 0, 2, @pushme
+ sub mykeys (+) mykeys %{$hashref}
sub myopen (*;$) myopen HANDLE, $name
sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
- sub myrand ($) myrand 42
+ sub myrand (;$) myrand 42
sub mytime () mytime
Any backslashed prototype character represents an actual argument
-that absolutely must start with that character. The value passed
-as part of C<@_> will be a reference to the actual argument given
-in the subroutine call, obtained by applying C<\> to that argument.
+that must start with that character (optionally preceded by C<my>,
+C<our> or C<local>), with the exception of C<$>, which will
+accept any scalar lvalue expression, such as C<$foo = 7> or
+C<< my_function()->[0] >>. The value passed as part of C<@_> will be a
+reference to the actual argument given in the subroutine call,
+obtained by applying C<\> to that argument.
-You can also backslash several argument types simultaneously by using
-the C<\[]> notation:
+You can use the C<\[]> backslash group notation to specify more than one
+allowed argument type. For example:
sub myref (\[$@%&*])
...
}
-A semicolon separates mandatory arguments from optional arguments.
+The C<+> prototype is a special alternative to C<$> that will act like
+C<\[@%]> when given a literal array or hash variable, but will otherwise
+force scalar context on the argument. This is useful for functions which
+should accept either a literal array or an array reference as the argument:
+
+ sub mypush (+@) {
+ my $aref = shift;
+ die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
+ push @$aref, @_;
+ }
+
+When using the C<+> prototype, your function must check that the argument
+is of an acceptable type.
+
+A semicolon (C<;>) separates mandatory arguments from optional arguments.
It is redundant before C<@> or C<%>, which gobble up everything else.
+As the last character of a prototype, or just before a semicolon, you can
+use C<_> in place of C<$>: if this argument is not provided, C<$_> will be
+used instead.
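
A minimal sketch of the C<_> prototype, using a hypothetical C<shout> function. The subroutine must be defined (or predeclared) before the calls so that the prototype is seen at compile time:

```perl
use strict;
use warnings;

# Hypothetical function: with the "_" prototype, a missing argument
# defaults to the caller's $_.
sub shout (_) {
    my ($text) = @_;
    return uc $text;
}

my $explicit = shout("hi");      # argument given, $_ is not consulted

my @implicit;
for (qw(foo bar)) {
    push @implicit, shout();     # no argument, so $_ is passed
}

print "$explicit @implicit\n";
```

Calling the function as C<&shout()> would bypass the prototype, and no default would be supplied.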
+
Note how the last three examples in the table above are treated
specially by the parser. C<mygrep()> is parsed as a true list
operator, C<myrand()> is parsed as a true unary operator with unary
mytime +2;
you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
-without a prototype.
+without a prototype. If you want to force a unary function to have the
+same precedence as a list operator, add C<;> to the end of the prototype:
+
+ sub mygetprotobynumber($;);
+ mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
The interesting thing about C<&> is that you can generate new syntax with it,
provided it's in the initial position:
+X<&>
sub try (&@) {
my($try,$catch) = @_;
is this sounding a little Lispish? (Never mind.))))
And here's a reimplementation of the Perl C<grep> operator:
+X<grep>
sub mygrep (&@) {
my $code = shift;
to make the world a better place.
=head2 Constant Functions
+X<constant>
Functions with a prototype of C<()> are potential candidates for
inlining. If the result after optimization and constant folding
}
=head2 Overriding Built-in Functions
+X<built-in> X<override> X<CORE> X<CORE::GLOBAL>
Many built-in functions may be overridden, though this should be tried
only occasionally and for good reason. Typically this might be
saying C<CORE::open()> always refers to the built-in C<open()>, even
if the current package has imported some other subroutine called
C<&open()> from elsewhere. Even though it looks like a regular
-function call, it isn't: you can't take a reference to it, such as
-the incorrect C<\&CORE::open> might appear to produce.
+function call, it isn't: the CORE:: prefix in that case is part of Perl's
+syntax, and works for any keyword, regardless of what is in the CORE
+package. Taking a reference to it, that is, C<\&CORE::open>, only works
+for some keywords. See L<CORE>.
Library modules should not in general export built-in names like C<open>
or C<chdir> as part of their default C<@EXPORT> list, because these may
sub glob {
my $pat = shift;
my @got;
- local *D;
- if (opendir D, '.') {
- @got = grep /$pat/, readdir D;
- closedir D;
+ if (opendir my $d, '.') {
+ @got = grep /$pat/, readdir $d;
+ closedir $d;
}
return @got;
}
C<glob>, the C<< <*> >> glob operator is overridden as well.
In a similar fashion, overriding the C<readline> function also overrides
-the equivalent I/O operator C<< <FILEHANDLE> >>.
+the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding
+C<readpipe> overrides the operators C<``> and C<qx//>.
Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.
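
As a hedged illustration of the C<readpipe> case mentioned above, the sketch below installs a stub through C<CORE::GLOBAL::readpipe> at compile time. The stub and the C<@seen> array are hypothetical; the stub records each command rather than running it, which is the kind of thing a test harness might do:

```perl
use strict;
use warnings;

our @seen;    # commands captured by the stub

BEGIN {
    no warnings 'once';
    # Hypothetical stub: record the command instead of running it.
    *CORE::GLOBAL::readpipe = sub {
        my ($cmd) = @_;
        push @seen, $cmd;
        return "stubbed: $cmd";
    };
}

# Both operator forms are expected to reach the stub.
my $a_out = `echo one`;
my $b_out = qx/echo two/;

print "$a_out\n$b_out\n";
```

Because the override is installed globally, it affects every package compiled afterwards, so this approach is best kept to small scripts and tests.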
=head2 Autoloading
+X<autoloading> X<AUTOLOAD>
If you call a subroutine that is undefined, you would ordinarily
get an immediate, fatal error complaining that the subroutine doesn't
variable of the same package as the C<AUTOLOAD> routine. The name
is not passed as an ordinary argument because, er, well, just
because, that's why. (As an exception, a method call to a nonexistent
-C<import> or C<unimport> method is just skipped instead.)
+C<import> or C<unimport> method is just skipped instead. Also, if
+the AUTOLOAD subroutine is an XSUB, C<$AUTOLOAD> is not populated;
+instead, you should call L<< C<SvPVX>E<sol>C<SvCUR>|perlapi >> on the
+C<CV> for C<AUTOLOAD> to retrieve the method name.)
+
Many C<AUTOLOAD> routines load in a definition for the requested
subroutine using eval(), then execute that subroutine using a special
functions to Perl code in L<perlxs>.
=head2 Subroutine Attributes
+X<attribute> X<subroutine, attribute> X<attrs>
A subroutine declaration or definition may have a list of attributes
associated with it. If such an attribute list is present, it is
Examples of valid syntax (even though the attributes are unknown):
- sub fnord (&\%) : switch(10,foo(7,3)) : expensive ;
- sub plugh () : Ugly('\(") :Bad ;
+ sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
+ sub plugh () : Ugly('\(") :Bad;
sub xyzzy : _5x5 { ... }
Examples of invalid syntax:
- sub fnord : switch(10,foo() ; # ()-string not balanced
- sub snoid : Ugly('(') ; # ()-string not balanced
- sub xyzzy : 5x5 ; # "5x5" not a valid identifier
- sub plugh : Y2::north ; # "Y2::north" not a simple identifier
- sub snurt : foo + bar ; # "+" not a colon or space
+ sub fnord : switch(10,foo(); # ()-string not balanced
+ sub snoid : Ugly('('); # ()-string not balanced
+ sub xyzzy : 5x5; # "5x5" not a valid identifier
+ sub plugh : Y2::north; # "Y2::north" not a simple identifier
+ sub snurt : foo + bar; # "+" not a colon or space
The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine. In particular, the second example
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
-See L<perltoot> to learn how to make object method calls.
+See L<perlootut> to learn how to make object method calls.