automatically freeing the thing referred to when its reference count goes
to zero. (Reference counts for values in self-referential or
cyclic data structures may not go to zero without a little help; see
-L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.)
+L</"Circular References"> for a detailed explanation.)
If that thing happens to be an object, the object is destructed. See
L<perlobj> for more about objects. (In a sense, everything in Perl is an
object, but we usually reserve the word for references to objects that
X<reference, hard> X<hard reference>
References are easy to use in Perl. There is just one overriding
-principle: Perl does no implicit referencing or dereferencing. When a
-scalar is holding a reference, it always behaves as a simple scalar. It
-doesn't magically start being an array or hash or subroutine; you have to
+principle: in general, Perl does no implicit referencing or dereferencing.
+When a scalar is holding a reference, it always behaves as a simple scalar.
+It doesn't magically start being an array or hash or subroutine; you have to
tell it explicitly to do so, by dereferencing it.
=head2 Making References
X<\> X<backslash>
By using the backslash operator on a variable, subroutine, or value.
-(This works much like the & (address-of) operator in C.)
+(This works much like the & (address-of) operator in C.)
This typically creates I<another> reference to a variable, because
there's already a reference to the variable in the symbol table. But
the symbol table reference might go away, and you'll still have the
a list of references!
@list = (\$a, \@b, \%c);
- @list = \($a, @b, %c); # same thing!
+ @list = \($a, @b, %c); # same thing!
As a special case, C<\(@foo)> returns a list of references to the contents
of C<@foo>, not a reference to C<@foo> itself. Likewise for C<%foo>,
brackets:
$hashref = {
- 'Adam' => 'Eve',
- 'Clyde' => 'Bonnie',
+ 'Adam' => 'Eve',
+ 'Clyde' => 'Bonnie',
};
Anonymous hash and array composers like these can be intermixed freely to
On the other hand, if you want the other meaning, you can do this:
- sub showem { { @_ } } # ambiguous (currently ok, but may change)
+ sub showem { { @_ } } # ambiguous (currently ok,
+ # but may change)
sub showem { {; @_ } } # ok
sub showem { { return @_ } } # ok
closures work:
sub newprint {
- my $x = shift;
- return sub { my $y = shift; print "$x, $y!\n"; };
+ my $x = shift;
+ return sub { my $y = shift; print "$x, $y!\n"; };
}
$h = newprint("Howdy");
$g = newprint("Greetings");
$ioref = *STDIN{IO};
$globref = *foo{GLOB};
$formatref = *foo{FORMAT};
+ $globname = *foo{NAME}; # "foo"
+ $pkgname = *foo{PACKAGE}; # "main"
-All of these are self-explanatory except for C<*foo{IO}>. It returns
+Most of these are self-explanatory, but C<*foo{IO}>
+deserves special attention. It returns
the IO handle, used for file handles (L<perlfunc/open>), sockets
(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory
handles (L<perlfunc/opendir>). For compatibility with previous
versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it
-is deprecated as of 5.8.0. If deprecation warnings are in effect, it will warn
-of its use.
+is discouraged, to encourage a consistent use of one name: IO. On perls
+between v5.8 and v5.22, it will issue a deprecation warning, but this
+deprecation has since been rescinded.
C<*foo{THING}> returns undef if that particular THING hasn't been used yet,
except in the case of scalars. C<*foo{SCALAR}> returns a reference to an
anonymous scalar if $foo hasn't been used yet. This might change in a
future release.
+C<*foo{NAME}> and C<*foo{PACKAGE}> are the exception, in that they return
+strings, rather than references. These return the package and name of the
+typeglob itself, rather than one that has been assigned to it. So, after
+C<*foo=*Foo::bar>, C<*foo> will become "*Foo::bar" when used as a string,
+but C<*foo{PACKAGE}> and C<*foo{NAME}> will continue to produce "main" and
+"foo", respectively.
+
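As a minimal sketch of the point above (the package C<Foo> and its variable C<$Foo::bar> are hypothetical names chosen for illustration), the following shows that C<*foo{NAME}> and C<*foo{PACKAGE}> keep describing the typeglob itself even after another glob has been assigned to it:

```perl
use strict;
use warnings;

$Foo::bar = 42;              # hypothetical package variable, illustration only
our $foo;                    # declare $main::foo so strict is satisfied

*foo = *Foo::bar;            # glob assignment: *foo now shares *Foo::bar's slots

my $name = *foo{NAME};       # plain string, not a reference
my $pkg  = *foo{PACKAGE};    # plain string, not a reference

print $pkg, "::", $name, "\n";   # prints main::foo
print $foo, "\n";                # prints 42, the scalar slot is shared
```

Neither value is a reference, so C<ref($name)> and C<ref($pkg)> are both empty strings.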
C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in
L<perldata/"Typeglobs and Filehandles"> for passing filehandles
into or out of subroutines, or storing into larger data structures.
value to a scalar instead of a typeglob as we do in the examples
below, there's no risk of that happening.
- splutter(*STDOUT); # pass the whole glob
- splutter(*STDOUT{IO}); # pass both file and dir handles
+ splutter(*STDOUT); # pass the whole glob
+ splutter(*STDOUT{IO}); # pass both file and dir handles
sub splutter {
- my $fh = shift;
- print $fh "her um well a hmmm\n";
+ my $fh = shift;
+ print $fh "her um well a hmmm\n";
}
- $rec = get_rec(*STDIN); # pass the whole glob
+ $rec = get_rec(*STDIN); # pass the whole glob
$rec = get_rec(*STDIN{IO}); # pass both file and dir handles
sub get_rec {
- my $fh = shift;
- return scalar <$fh>;
+ my $fh = shift;
+ return scalar <$fh>;
}
=back
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:
- &{ $dispatch{$index} }(1,2,3); # call correct routine
+ &{ $dispatch{$index} }(1,2,3); # call correct routine
Because of being able to omit the curlies for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
Consider the difference below; case 0 is a short-hand version of case 1,
I<not> case 2:
- $$hashref{"KEY"} = "VALUE"; # CASE 0
- ${$hashref}{"KEY"} = "VALUE"; # CASE 1
- ${$hashref{"KEY"}} = "VALUE"; # CASE 2
- ${$hashref->{"KEY"}} = "VALUE"; # CASE 3
+ $$hashref{"KEY"} = "VALUE"; # CASE 0
+ ${$hashref}{"KEY"} = "VALUE"; # CASE 1
+ ${$hashref{"KEY"}} = "VALUE"; # CASE 2
+ ${$hashref->{"KEY"}} = "VALUE"; # CASE 3
Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
X<reference, numeric context>
if ($ref1 == $ref2) { # cheap numeric compare of references
- print "refs 1 and 2 refer to the same thing\n";
+ print "refs 1 and 2 refer to the same thing\n";
}
Using a reference as a string produces both its referent's type,
print "That yields ${\($n + 5)} widgets\n";
+=head2 Circular References
+X<circular reference> X<reference, circular>
+
+It is possible to create a "circular reference" in Perl, which can lead
+to memory leaks. A circular reference occurs when two references
+contain a reference to each other, like this:
+
+ my $foo = {};
+ my $bar = { foo => $foo };
+ $foo->{bar} = $bar;
+
+You can also create a circular reference with a single variable:
+
+ my $foo;
+ $foo = \$foo;
+
+In this case, the reference count for the variables will never reach 0,
+and the references will never be garbage-collected. This can lead to
+memory leaks.
+
+Because objects in Perl are implemented as references, it's possible to
+have circular references with objects as well. Imagine a TreeNode class
+where each node references its parent and child nodes. Any node with a
+parent will be part of a circular reference.
+
+You can break circular references by creating a "weak reference". A
+weak reference does not increment the reference count for a variable,
+which means that the object can go out of scope and be destroyed. You
+can weaken a reference with the C<weaken> function exported by the
+L<Scalar::Util> module.
+
+Here's how we can make the first example safer:
+
+ use Scalar::Util 'weaken';
+
+ my $foo = {};
+ my $bar = { foo => $foo };
+ $foo->{bar} = $bar;
+
+ weaken $foo->{bar};
+
+The reference from C<$foo> to C<$bar> has been weakened. When the
+C<$bar> variable goes out of scope, it will be garbage-collected. The
+next time you look at the value of the C<< $foo->{bar} >> key, it will
+be C<undef>.
+
+This action at a distance can be confusing, so you should be careful
+with your use of weaken. You should weaken the reference in the
+variable that will go out of scope I<first>. That way, the longer-lived
+variable will contain the expected reference until it goes out of
+scope.
+
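The behaviour described above can be observed with a short sketch. It assumes only the core L<Scalar::Util> module; the inner block is added here just to make C<$bar> go out of scope at a known point:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $foo = {};
my $was_weak;

{
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;          # cycle: $foo -> $bar -> $foo

    weaken $foo->{bar};          # this link no longer keeps $bar's hash alive
    $was_weak = isweak($foo->{bar}) ? 1 : 0;
}

# $bar is out of scope, so the hash it referred to has been freed
# and the weakened slot has been reset to undef.
print "was weak: $was_weak\n";
print defined $foo->{bar} ? "still set\n" : "undef\n";
```

The C<bar> key still exists in C<%$foo>; only its value has become C<undef>.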
=head2 Symbolic references
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>
People frequently expect it to work like this. So it does.
$name = "foo";
- $$name = 1; # Sets $foo
- ${$name} = 2; # Sets $foo
- ${$name x 2} = 3; # Sets $foofoo
- $name->[0] = 4; # Sets $foo[0]
- @$name = (); # Clears @foo
- &$name(); # Calls &foo() (as in Perl 4)
+ $$name = 1; # Sets $foo
+ ${$name} = 2; # Sets $foo
+ ${$name x 2} = 3; # Sets $foofoo
+ $name->[0] = 4; # Sets $foo[0]
+ @$name = (); # Clears @foo
+ &$name(); # Calls &foo()
$pack = "THAT";
- ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval
+ ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval
This is powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
local $value = 10;
$ref = "value";
{
- my $value = 20;
- print $$ref;
+ my $value = 20;
+ print $$ref;
}
This will still print 10, not 20. Remember that local() affects package
=head2 Not-so-symbolic references
-A new feature contributing to readability in perl version 5.001 is that the
-brackets around a symbolic reference behave more like quotes, just as they
-always have within a string. That is,
+Brackets around a symbolic reference can simply
+serve to isolate an identifier or variable name from the rest of an
+expression, just as they always have within a string. For example,
$push = "pop on ";
print "${push}over";
has always meant to print "pop on over", even though push is
-a reserved word. This has been generalized to work the same outside
-of quotes, so that
+a reserved word. This is generalized to work the same
+without the enclosing double quotes, so that
print ${push} . "over";
print ${ push } . "over";
-will have the same effect. (This would have been a syntax error in
-Perl 5.000, though Perl 4 allowed it in the spaceless form.) This
+will have the same effect. This
construct is I<not> considered to be a symbolic reference when you're
using strict refs:
use strict 'refs';
- ${ bareword }; # Okay, means $bareword.
- ${ "bareword" }; # Error, symbolic reference.
+ ${ bareword }; # Okay, means $bareword.
+ ${ "bareword" }; # Error, symbolic reference.
-Similarly, because of all the subscripting that is done using single
-words, we've applied the same rule to any bareword that is used for
-subscripting a hash. So now, instead of writing
+Similarly, because of all the subscripting that is done using single words,
+the same rule applies to any bareword that is used for subscripting a hash.
+So now, instead of writing
$array{ "aaa" }{ "bbb" }{ "ccc" }
The red() and green() functions would be similar. To create these,
we'll assign a closure to a typeglob of the name of the function we're
-trying to build.
+trying to build.
@colors = qw(red blue green yellow orange purple violet);
for my $name (@colors) {
- no strict 'refs'; # allow symbol table manipulation
+ no strict 'refs'; # allow symbol table manipulation
*$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
- }
+ }
Now all those different functions appear to exist independently. You can
call red(), RED(), blue(), BLUE(), green(), etc. This technique saves on
putting the whole loop of assignments within a BEGIN block, forcing it
to occur during compilation.
-Access to lexicals that change over type--like those in the C<for> loop
-above--only works with closures, not general subroutines. In the general
-case, then, named subroutines do not nest properly, although anonymous
-ones do. Thus is because named subroutines are created (and capture any
-outer lexicals) only once at compile time, whereas anonymous subroutines
-get to capture each time you execute the 'sub' operator. If you are
-accustomed to using nested subroutines in other programming languages with
-their own private variables, you'll have to work at it a bit in Perl. The
-intuitive coding of this type of thing incurs mysterious warnings about
-"will not stay shared". For example, this won't work:
+Access to lexicals that change over time--like those in the C<for> loop
+above, basically aliases to elements from the surrounding lexical scopes--
+only works with anonymous subs, not with named subroutines. Generally
+said, named subroutines do not nest properly and should only be declared
+in the main package scope.
+
+This is because named subroutines are created at compile time so their
+lexical variables get assigned to the parent lexicals from the first
+execution of the parent block. If a parent scope is entered a second
+time, its lexicals are created again, while the nested subs still
+reference the old ones.
+
+Anonymous subroutines get to capture each time you execute the C<sub>
+operator, as they are created on the fly. If you are accustomed to using
+nested subroutines in other programming languages with their own private
+variables, you'll have to work at it a bit in Perl. The intuitive coding
+of this type of thing incurs mysterious warnings about "will not stay
+shared" due to the reasons explained above.
+For example, this won't work:
sub outer {
my $x = $_[0] + 35;
}
Now inner() can only be called from within outer(), because of the
-temporary assignments of the closure (anonymous subroutine). But when
-it does, it has normal access to the lexical variable $x from the scope
-of outer().
+temporary assignments of the anonymous subroutine. But when it does,
+it has normal access to the lexical variable $x from the scope of
+outer() at the time outer is invoked.
This has the interesting effect of creating a function local to another
function, something not normally supported in Perl.
-=head1 WARNING
+=head1 WARNING: Don't use references as hash keys
X<reference, string context> X<reference, use as hash key>
You may not (usefully) use a reference as the key to a hash. It will be
The standard Tie::RefHash module provides a convenient workaround to this.
+=head2 Postfix Dereference Syntax
+
+Beginning in v5.20.0, a postfix syntax for using references is
+available. It behaves as described in L</Using References>, but instead
+of a prefixed sigil, a postfixed sigil-and-star is used.
+
+For example:
+
+ $r = \@a;
+ @b = $r->@*; # equivalent to @$r or @{ $r }
+
+ $r = [ 1, [ 2, 3 ], 4 ];
+ $r->[1]->@*; # equivalent to @{ $r->[1] }
+
+In Perl 5.20 and 5.22, this syntax must be enabled with C<use feature
+'postderef'>. As of Perl 5.24, no feature declarations are required to make
+it available.
+
+Postfix dereference should work in all circumstances where block
+(circumfix) dereference worked, and should be entirely equivalent. This
+syntax allows dereferencing to be written and read entirely
+left-to-right. The following equivalencies are defined:
+
+ $sref->$*; # same as ${ $sref }
+ $aref->@*; # same as @{ $aref }
+ $aref->$#*; # same as $#{ $aref }
+ $href->%*; # same as %{ $href }
+ $cref->&*; # same as &{ $cref }
+ $gref->**; # same as *{ $gref }
+
+Note especially that C<< $cref->&* >> is I<not> equivalent to C<<
+$cref->() >>, and can serve different purposes.
+
+Glob elements can be extracted through the postfix dereferencing feature:
+
+ $gref->*{SCALAR}; # same as *{ $gref }{SCALAR}
+
+Postfix array and scalar dereferencing I<can> be used in interpolating
+strings (double quotes or the C<qq> operator), but only if the
+C<postderef_qq> feature is enabled.
+
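A small sketch of the equivalences listed above, assuming Perl 5.24 or later so that no C<postderef> feature declaration is needed. The variable names are illustrative only, and C<postderef_qq> is enabled explicitly for the interpolation case:

```perl
use v5.24;
use warnings;
use feature 'postderef_qq';      # needed for postfix deref inside strings

my $aref = [ 1, [ 2, 3 ], 4 ];
my $href = { a => 1, b => 2 };
my $sref = \"hello";

my @inner = $aref->[1]->@*;      # same as @{ $aref->[1] }
my $last  = $aref->$#*;          # same as $#{ $aref }
my @keys  = sort keys $href->%*; # same as keys %{ $href }
my $str   = $sref->$*;           # same as ${ $sref }

my $inner = $aref->[1];
my $text  = "inner: $inner->@*"; # interpolates like "@{ $inner }"

say "@inner";                    # 2 3
say $last;                       # 2
say "@keys";                     # a b
say $str;                        # hello
say $text;                       # inner: 2 3
```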
+=head2 Postfix Reference Slicing
+
+Value slices of arrays and hashes may also be taken with postfix
+dereferencing notation, with the following equivalencies:
+
+ $aref->@[ ... ]; # same as @$aref[ ... ]
+ $href->@{ ... }; # same as @$href{ ... }
+
+Postfix key/value pair slicing, added in 5.20.0 and documented in
+L<the KeyE<sol>Value Hash Slices section of perldata|perldata/"Key/Value Hash
+Slices">, also behaves as expected:
+
+ $aref->%[ ... ]; # same as %$aref[ ... ]
+ $href->%{ ... }; # same as %$href{ ... }
+
+As with postfix array dereferencing, postfix value slice dereferencing
+I<can> be used in interpolating strings (double quotes or the C<qq>
+operator), but only if the C<postderef_qq> L<feature> is enabled.
+
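The four slice forms above can be sketched as follows, again assuming Perl 5.24 or later. The data and variable names are made up for illustration:

```perl
use v5.24;
use warnings;

my $aref = [ qw(a b c d) ];
my $href = { x => 1, y => 2, z => 3 };

my @vals  = $aref->@[ 1, 2 ];       # value slice, same as @$aref[1, 2]
my @hvals = $href->@{ qw(x z) };    # value slice, same as @$href{qw(x z)}
my %idx   = $aref->%[ 0, 3 ];       # index/value slice, same as %$aref[0, 3]
my %pairs = $href->%{ qw(x y) };    # key/value slice, same as %$href{qw(x y)}

say "@vals";                                      # b c
say "@hvals";                                     # 1 3
say join ",", map { "$_=$idx{$_}" } sort keys %idx;     # 0=a,3=d
say join ",", map { "$_=$pairs{$_}" } sort keys %pairs; # x=1,y=2
```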
+=head2 Assigning to References
+
+Beginning in v5.22.0, the referencing operator can be assigned to. It
+performs an aliasing operation, so that the variable name referenced on the
+left-hand side becomes an alias for the thing referenced on the right-hand
+side:
+
+ \$a = \$b; # $a and $b now point to the same scalar
+ \&foo = \&bar; # foo() now means bar()
+
+This syntax must be enabled with C<use feature 'refaliasing'>. It is
+experimental, and will warn by default unless C<no warnings
+'experimental::refaliasing'> is in effect.
+
+These forms may be assigned to, and cause the right-hand side to be
+evaluated in scalar context:
+
+ \$scalar
+ \@array
+ \%hash
+ \&sub
+ \my $scalar
+ \my @array
+ \my %hash
+ \state $scalar # or @array, etc.
+ \our $scalar # etc.
+ \local $scalar # etc.
+ \local our $scalar # etc.
+ \$some_array[$index]
+ \$some_hash{$key}
+ \local $some_array[$index]
+ \local $some_hash{$key}
+ condition ? \$this : \$that[0] # etc.
+
+Slicing operations and parentheses cause
+the right-hand side to be evaluated in
+list context:
+
+ \@array[5..7]
+ (\@array[5..7])
+ \(@array[5..7])
+ \@hash{'foo','bar'}
+ (\@hash{'foo','bar'})
+ \(@hash{'foo','bar'})
+ (\$scalar)
+ \($scalar)
+ \(my $scalar)
+ \my($scalar)
+ (\@array)
+ (\%hash)
+ (\&sub)
+ \(&sub)
+ \($foo, @bar, %baz)
+ (\$foo, \@bar, \%baz)
+
+Each element on the right-hand side must be a reference to a datum of the
+right type. Parentheses immediately surrounding an array (and possibly
+also C<my>/C<state>/C<our>/C<local>) will make each element of the array an
+alias to the corresponding scalar referenced on the right-hand side:
+
+ \(@a) = \(@b); # @a and @b now have the same elements
+ \my(@a) = \(@b); # likewise
+ \(my @a) = \(@b); # likewise
+ push @a, 3; # but now @a has an extra element that @b lacks
+ \(@a) = (\$a, \$b, \$c); # @a now contains $a, $b, and $c
+
+Combining that form with C<local> and putting parentheses immediately
+around a hash are forbidden (because it is not clear what they should do):
+
+ \local(@array) = foo(); # WRONG
+ \(%hash) = bar(); # WRONG
+
+Assignment to references and non-references may be combined in lists and
+conditional ternary expressions, as long as the values on the right-hand
+side are the right type for each element on the left, though this may make
+for obfuscated code:
+
+ (my $tom, \my $dick, \my @harry) = (\1, \2, [1..3]);
+ # $tom is now \1
+ # $dick is now 2 (read-only)
+ # @harry is (1,2,3)
+
+ my $type = ref $thingy;
+ ($type ? $type eq 'ARRAY' ? \@foo : \$bar : $baz) = $thingy;
+
+The C<foreach> loop can also take a reference constructor for its loop
+variable, though the syntax is limited to one of the following, with an
+optional C<my>, C<state>, or C<our> after the backslash:
+
+ \$s
+ \@a
+ \%h
+ \&c
+
+No parentheses are permitted. This feature is particularly useful for
+arrays-of-arrays, or arrays-of-hashes:
+
+ foreach \my @a (@array_of_arrays) {
+ frobnicate($a[0], $a[-1]);
+ }
+
+ foreach \my %h (@array_of_hashes) {
+ $h{gelastic}++ if $h{type} eq 'funny';
+ }
+
+B<CAVEAT:> Aliasing does not work correctly with closures. If you try to
+alias lexical variables from an inner subroutine or C<eval>, the aliasing
+will only be visible within that inner sub, and will not affect the outer
+subroutine where the variables are declared. This bizarre behavior is
+subject to change.
+
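A minimal sketch of scalar aliasing and of the C<foreach> form described above, assuming Perl 5.22 or later with the experimental C<refaliasing> feature enabled. The variable names and data are illustrative only:

```perl
use v5.22;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my ($x, $y) = (10, 20);
\$x = \$y;                   # $x is now another name for $y
$y = 99;                     # so this change is visible through $x

my @array_of_arrays = ( [ 1, 2, 3 ], [ 4, 5, 6 ] );
my @sums;
foreach \my @row (@array_of_arrays) {
    # @row aliases each inner array in turn; nothing is copied
    push @sums, $row[0] + $row[-1];
}

say $x;                      # 99
say "@sums";                 # 4 10
```

Because the feature is experimental, a sketch like this may need adjusting on future perls.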
+=head1 Declaring a Reference to a Variable
+
+Beginning in v5.26.0, the referencing operator can come after C<my>,
+C<state>, C<our>, or C<local>. This syntax must be enabled with C<use
+feature 'declared_refs'>. It is experimental, and will warn by default
+unless C<no warnings 'experimental::declared_refs'> is in effect.
+
+This feature makes these:
+
+ my \$x;
+ our \$y;
+
+equivalent to:
+
+ \my $x;
+ \our $y;
+
+It is intended mainly for use in assignments to references (see
+L</Assigning to References>, above). It also allows the backslash to be
+used on just some items in a list of declared variables:
+
+ my ($foo, \@bar, \%baz); # equivalent to: my $foo, \my(@bar, %baz);
+
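A short sketch of a declared reference used in an aliasing assignment, assuming Perl 5.26 or later. Both experimental features are enabled, and their warning categories (C<experimental::refaliasing> and C<experimental::declared_refs>) are silenced; the variable names are illustrative only:

```perl
use v5.26;
use warnings;
use feature qw(refaliasing declared_refs);
no warnings qw(experimental::refaliasing experimental::declared_refs);

my @data = (1, 2, 3);

my \@alias = \@data;         # same as: \my @alias = \@data;
push @alias, 4;              # modifies @data as well, since both name one array

say scalar @data;            # 4
say "@data";                 # 1 2 3 4
```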
=head1 SEE ALSO
Besides the obvious documents, source code can be instructive.
in the F<t/op/ref.t> regression test in the Perl source directory.
See also L<perldsc> and L<perllol> for how to use references to create
-complex data structures, and L<perltoot>, L<perlobj>, and L<perlbot>
+complex data structures, and L<perlootut> and L<perlobj>
for how to use them to create objects.