This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
=head1 NAME

perlref - Perl references and nested data structures

=head1 DESCRIPTION

Before release 5 of Perl it was difficult to represent complex data
structures, because all references had to be symbolic, and even that was
difficult to do when you wanted to refer to a variable rather than a
symbol table entry. Perl not only makes it easier to use symbolic
references to variables, but lets you have "hard" references to any piece
of data. Any scalar may hold a hard reference. Because arrays and hashes
contain scalars, you can now easily build arrays of arrays, arrays of
hashes, hashes of arrays, arrays of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you,
automatically freeing the thing referred to when its reference count
goes to zero. (Note: The reference counts for values in self-referential
or cyclic data structures may not go to zero without a little help; see
L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.)
If that thing happens to be an object, the object is
destructed. See L<perlobj> for more about objects. (In a sense,
everything in Perl is an object, but we usually reserve the word for
references to objects that have been officially "blessed" into a class package.)

A symbolic reference contains the name of a variable, just as a
symbolic link in the filesystem contains merely the name of a file.
The C<*glob> notation is a kind of symbolic reference. Hard references
are more like hard links in the file system: merely another way
of getting at the same underlying object, irrespective of its name.

"Hard" references are easy to use in Perl. There is just one
overriding principle: Perl does no implicit referencing or
dereferencing. When a scalar is holding a reference, it always behaves
as a scalar. It doesn't magically start being an array or a hash
unless you tell it so explicitly by dereferencing it.
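
The principle above can be seen in a few lines. This is only a sketch:
the variable names are made up for illustration, and the C<@$arrayref>
dereference it uses is explained later in this document.

```perl
use strict;
use warnings;

my @months   = ("January", "February", "March");
my $arrayref = \@months;          # a scalar holding a hard reference

# Used as a scalar, the reference stays a single scalar value.
my @holder = ($arrayref);         # one element, not three
my $count_without_deref = scalar @holder;

# Only an explicit dereference reaches the array behind it.
my $count_with_deref = scalar @$arrayref;

print "without dereference: $count_without_deref element\n";
print "with dereference: $count_with_deref elements\n";
```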

References can be constructed several ways.

=over 4

=item 1.

By using the backslash operator on a variable, subroutine, or value.
(This works much like the & (address-of) operator works in C.) Note
that this typically creates I<ANOTHER> reference to a variable, because
there's already a reference to the variable in the symbol table. But
the symbol table reference might go away, and you'll still have the
reference that the backslash returned. Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*foo;

It isn't possible to create a true reference to an IO handle (filehandle or
dirhandle) using the backslash operator. See the explanation of the
*foo{THING} syntax below. (However, you're apt to find Perl code
out there using globrefs as though they were IO handles, which is
grandfathered into continued functioning.)

=item 2.

A reference to an anonymous array can be constructed using square
brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've constructed a reference to an anonymous array of three elements
whose final element is itself a reference to another anonymous array of three
elements. (The multidimensional syntax described later can be used to
access this. For example, after the above, C<$arrayref-E<gt>[2][1]> would have
the value "b".)

Note that taking a reference to an enumerated list is not the same
as using square brackets--instead it's the same as creating
a list of references!

    @list = (\$a, \@b, \%c);
    @list = \($a, @b, %c);      # same thing!

As a special case, C<\(@foo)> returns a list of references to the contents
of C<@foo>, not a reference to C<@foo> itself. Likewise for C<%foo>.
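
The difference between C<\@foo> and C<\(@foo)> is easy to miss, so here is
a small sketch of it. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @foo = (1, 2, 3);

my $whole    = \@foo;       # one reference to the array itself
my @each_ref = \(@foo);     # three references, one per element

# Writing through an element reference changes the original array.
${ $each_ref[0] } = 100;

print "array ref type: ", ref($whole), "\n";
print "element refs: ", scalar(@each_ref), "\n";
print "first element is now $foo[0]\n";
```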

=item 3.

A reference to an anonymous hash can be constructed using curly
brackets:

    $hashref = {
        'Adam'  => 'Eve',
        'Clyde' => 'Bonnie',
    };

Anonymous hash and array constructors can be intermixed freely to
produce as complicated a structure as you want. The multidimensional
syntax described below works for these too. The values above are
literals, but variables and expressions would work just as well, because
assignment operators in Perl (even within local() or my()) are executable
statements, not compile-time declarations.

Because curly brackets (braces) are used for several other things
including BLOCKs, you may occasionally have to disambiguate braces at the
beginning of a statement by putting a C<+> or a C<return> in front so
that Perl realizes the opening brace isn't starting a BLOCK. The economy and
mnemonic value of using curlies is deemed worth this occasional extra
hassle.

For example, if you wanted a function to make a new hash and return a
reference to it, you have these options:

    sub hashem {        { @_ } }   # silently wrong
    sub hashem {       +{ @_ } }   # ok
    sub hashem { return { @_ } }   # ok
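
As a sketch of intermixing the anonymous hash and array constructors,
with an expression among the values, something like the following should
work. The data, field names, and C<$suffix> are invented for the example,
and the arrow subscripts used to read the structure back are described
later in this document.

```perl
use strict;
use warnings;

# Hypothetical data: the names and fields are made up for illustration.
my $suffix = "ville";
my $people = {
    'Adam'  => { spouse => 'Eve',    towns => [ "Eden" . $suffix ] },
    'Clyde' => { spouse => 'Bonnie', towns => [ 'Dallas', 'Joplin' ] },
};

print "Clyde's spouse: $people->{Clyde}{spouse}\n";
print "Clyde's second town: $people->{Clyde}{towns}[1]\n";
print "Adam's town: $people->{Adam}{towns}[0]\n";
```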

=item 4.

A reference to an anonymous subroutine can be constructed by using
C<sub> without a subname:

    $coderef = sub { print "Boink!\n" };

Note the presence of the semicolon. Except for the fact that the code
inside isn't executed immediately, a C<sub {}> is not so much a
declaration as it is an operator, like C<do{}> or C<eval{}>. (However, no
matter how many times you execute that line (unless you're in an
C<eval("...")>), C<$coderef> will still have a reference to the I<SAME>
anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables,
that is, variables visible lexically within the current scope. Closure
is a notion out of the Lisp world that says if you define an anonymous
function in a particular lexical context, it pretends to run in that
context even when it's called outside of the context.

In human terms, it's a funny way of passing arguments to a subroutine when
you define it as well as when you call it. It's useful for setting up
little bits of code to run later, such as callbacks. You can even
do object-oriented stuff with it, though Perl provides a different
mechanism to do that already--see L<perlobj>.

You can also think of closure as a way to write a subroutine template without
using eval. (In fact, in version 5.000, eval was the I<only> way to get
closures. You may wish to use "require 5.001" if you use closures.)

Here's a small example of how closures work:

    sub newprint {
        my $x = shift;
        return sub { my $y = shift; print "$x, $y!\n"; };
    }
    $h = newprint("Howdy");
    $g = newprint("Greetings");

    # Time passes...

    &$h("world");
    &$g("earthlings");

This prints

    Howdy, world!
    Greetings, earthlings!

Note particularly that $x continues to refer to the value passed into
newprint() I<despite> the fact that the "my $x" has seemingly gone out of
scope by the time the anonymous subroutine runs. That's what closure
is all about.

This applies only to lexical variables, by the way. Dynamic variables
continue to work as they have always worked. Closure is not something
that most Perl programmers need trouble themselves about to begin with.
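
A closure can also keep private state between calls, which is one way to
read the "subroutine template" remark above. The following is a minimal
sketch; C<make_counter> is a hypothetical helper, not anything Perl
provides.

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper, not part of Perl itself.
sub make_counter {
    my $count = shift;                 # private to each call of make_counter
    return sub { return $count++ };    # the closure captures this $count
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure advances its own copy of $count.
my @seen = (&$from_one(), &$from_one(), &$from_ten(), &$from_one());
print "@seen\n";
```

Each call to C<make_counter> creates a fresh C<$count>, so the two
counters do not interfere with one another.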

=item 5.

References are often returned by special subroutines called constructors.
Perl objects are just references to a special kind of object that happens to know
which package it's associated with. Constructors are just special
subroutines that know how to create that association. They do so by
starting with an ordinary reference, and it remains an ordinary reference
even while it's also being an object. Constructors are customarily
named new(), but don't have to be:

    $objref = new Doggie (Tail => 'short', Ears => 'long');
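
What such a constructor might look like on the inside is sketched below.
The body of C<Doggie> is an assumption made for illustration (the text
above only shows the call), as is the C<tail> accessor; the call is
written as C<Doggie-E<gt>new(...)>, which means the same as the
C<new Doggie (...)> form above.

```perl
use strict;
use warnings;

package Doggie;

# A hypothetical constructor: the class body is invented for this sketch.
sub new {
    my ($class, %args) = @_;
    my $self = { %args };          # start with an ordinary hash reference
    return bless $self, $class;    # associate it with the package
}

sub tail { my $self = shift; return $self->{Tail} }

package main;

my $objref = Doggie->new(Tail => 'short', Ears => 'long');

print ref($objref), "\n";          # the package the reference was blessed into
print $objref->tail, "\n";         # method call through the object
print $objref->{Ears}, "\n";       # still an ordinary hash reference underneath
```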

=item 6.

References of the appropriate type can spring into existence if you
dereference them in a context that assumes they exist. Because we haven't
talked about dereferencing yet, we can't show you any examples yet.

=item 7.

A reference can be created by using a special syntax, lovingly known as
the *foo{THING} syntax. *foo{THING} returns a reference to the THING
slot in *foo (which is the symbol table entry which holds everything
known as foo).

    $scalarref = *foo{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handler{CODE};
    $ioref     = *STDIN{IO};
    $globref   = *foo{GLOB};

All of these are self-explanatory except for *foo{IO}. It returns the
IO handle, used for file handles (L<perlfunc/open>), sockets
(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory handles
(L<perlfunc/opendir>). For compatibility with previous versions of
Perl, *foo{FILEHANDLE} is a synonym for *foo{IO}.

*foo{THING} returns undef if that particular THING hasn't been used yet,
except in the case of scalars. *foo{SCALAR} returns a reference to an
anonymous scalar if $foo hasn't been used yet. This might change in a
future release.
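
The undef-versus-scalar behaviour just described can be checked with a
short sketch. The symbol names C<used> and C<unused> are made up for the
example, and since the text warns that the scalar case might change, treat
the last line as describing current behaviour only.

```perl
use strict;
use warnings;
no warnings 'once';     # the names below are deliberately used only once

our @used = (1, 2, 3);

my $used_ref   = *used{ARRAY};      # reference to @used
my $unused_ref = *unused{ARRAY};    # @unused was never used, so this is undef
my $scalar_ref = *unused{SCALAR};   # the scalar slot is the exception

print "used: ", ref($used_ref), "\n";
print "unused array slot is ", (defined $unused_ref ? "defined" : "undef"), "\n";
print "scalar slot: ", ref($scalar_ref), "\n";
```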

The use of *foo{IO} is the best way to pass bareword filehandles into or
out of subroutines, or to store them in larger data structures.

    splutter(*STDOUT{IO});
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(*STDIN{IO});
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

Beware, though, that you can't do this with a routine which is going to
open the filehandle for you, because *HANDLE{IO} will be undef if HANDLE
hasn't been used yet. Use \*HANDLE for that sort of thing instead.
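
Here is a sketch of that last case, where the routine does the opening.
The helper name C<open_buffer> and the handle name C<LOG> are invented for
the example, and it writes to an in-memory file (a reference to a scalar
as the third argument to open()) only so that it runs without touching
the disk.

```perl
use strict;
use warnings;

# open_buffer is a hypothetical helper that opens the handle it is given.
sub open_buffer {
    my ($fh, $bufref) = @_;
    open($fh, '>', $bufref) or die "cannot open in-memory file: $!";
    return;
}

my $buffer = '';
open_buffer(\*LOG, \$buffer);       # pass the glob reference, not *LOG{IO}
print LOG "her um well a hmmm\n";
close(LOG) or die "cannot close: $!";

print $buffer;
```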

Using \*HANDLE (or *HANDLE) is another way to use and store non-bareword
filehandles (before perl version 5.002 it was the only way). The two
methods are largely interchangeable; you can do

    splutter(\*STDOUT);
    $rec = get_rec(\*STDIN);

with the above subroutine definitions.

=back

That's it for creating references. By now you're probably dying to
know how to use references to get back to your long-lost data. There
are several basic methods.

=over 4

=item 1.

Anywhere you'd put an identifier (or chain of identifiers) as part
of a variable or subroutine name, you can replace the identifier with
a simple scalar variable containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    &$coderef(1,2,3);
    print $globref "output\n";

It's important to understand that we are specifically I<NOT> dereferencing
C<$arrayref[0]> or C<$hashref{"KEY"}> there. The dereference of the
scalar variable happens I<BEFORE> it does any key lookups. Anything more
complicated than a simple scalar variable must use methods 2 or 3 below.
However, a "simple scalar" includes an identifier that itself uses method
1 recursively. Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

=item 2.

Anywhere you'd put an identifier (or chain of identifiers) as part of a
variable or subroutine name, you can replace the identifier with a
BLOCK returning a reference of the correct type. In other words, the
previous examples could be written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    &{$coderef}(1,2,3);
    $globref->print("output\n");  # iff IO::Handle is loaded

Admittedly, it's a little silly to use the curlies in this case, but
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);      # call correct routine
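
A fuller sketch of such a dispatch table follows. The keys and the
routines stored under them are invented for the example; only the calling
syntax comes from the line above.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: each value is a code reference.
my %dispatch = (
    sum     => sub { my $total = 0; $total += $_ for @_; return $total },
    largest => sub { my $max = shift; for (@_) { $max = $_ if $_ > $max } return $max },
    count   => sub { return scalar @_ },
);

my $index  = "largest";
my $result = &{ $dispatch{$index} }(1, 2, 3);    # call correct routine

print "$index of (1,2,3) is $result\n";
```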

Because of being able to omit the curlies for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
proper operators, and wonder about their precedence. If they were,
though, you could use parentheses instead of braces. That's not the case.
Consider the difference below; case 0 is a short-hand version of case 1,
I<NOT> case 2:

    $$hashref{"KEY"}   = "VALUE";       # CASE 0
    ${$hashref}{"KEY"} = "VALUE";       # CASE 1
    ${$hashref{"KEY"}} = "VALUE";       # CASE 2
    ${$hashref->{"KEY"}} = "VALUE";     # CASE 3

Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
it's presumably referencing. That would be case 3.

=item 3.

The case of individual array elements arises often enough that it gets
cumbersome to use method 2. As a form of syntactic sugar, the array and
hash element assignments shown for method 2 can be written:

    $arrayref->[0] = "January";
    $hashref->{"KEY"} = "VALUE";

The left side of the arrow can be any expression returning a reference,
including a previous dereference. Note that C<$array[$x]> is I<NOT> the
same thing as C<$array-E<gt>[$x]> here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could
spring into existence when in an lvalue context. Before this
statement, C<$array[$x]> may have been undefined. If so, it's
automatically defined with a hash reference so that we can look up
C<{"foo"}> in it. Likewise C<$array[$x]-E<gt>{"foo"}> will automatically get
defined with an array reference so that we can look up C<[0]> in it.
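
To watch the references spring into existence, one can start from an
empty array and inspect it afterwards. This is only a sketch; the value of
C<$x> is arbitrary.

```perl
use strict;
use warnings;

my @array;
my $x = 2;

# Nothing exists yet; the assignment creates every level on the way down.
$array[$x]->{"foo"}->[0] = "January";

print "element type: ", ref($array[$x]), "\n";
print "inner type: ", ref($array[$x]->{"foo"}), "\n";
print "array length: ", scalar(@array), "\n";
```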

One more thing here. The arrow is optional I<BETWEEN> bracket
subscripts, so you can shrink the above down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you
multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually. C doesn't know how
to grow its arrays on demand. Perl does.

=item 4.

If a reference happens to be a reference to an object, then there are
probably methods to access the things referred to, and you should probably
stick to those methods unless you're in the class package that defines the
object's methods. In other words, be nice, and don't violate the object's
encapsulation without a very good reason. Perl does not enforce
encapsulation. We are not totalitarians here. We do expect some basic
civility though.

=back

The ref() operator may be used to determine what type of thing the
reference is pointing to. See L<perlfunc>.
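
As a quick sketch of what ref() reports for the kinds of unblessed
reference built earlier in this document (the sample values are
arbitrary):

```perl
use strict;
use warnings;

my @kinds = map { ref($_) } (
    \"a string",            # SCALAR
    [1, 2, 3],              # ARRAY
    { key => 'value' },     # HASH
    sub { 1 },              # CODE
    \\"nested",             # REF (a reference to a reference)
    "not a reference",      # empty string
);

print join(",", @kinds), "\n";
```

For a value that is not a reference at all, ref() returns the empty
string, which is why the joined list ends with a trailing comma.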

The bless() operator may be used to associate a reference with a package
functioning as an object class. See L<perlobj>.

A typeglob may be dereferenced the same way a reference can, because
the dereference syntax always indicates the kind of reference desired.
So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.

Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

The way it works is that when the C<@{...}> is seen in the double-quoted
string, it's evaluated as a block. The block creates a reference to an
anonymous array containing the results of the call to C<mysub(1,2,3)>. So
the whole block returns a reference to an array, which is then
dereferenced by C<@{...}> and stuck into the double-quoted string. This
chicanery is also useful for arbitrary expressions:

    print "That yields @{[$n + 5]} widgets\n";

=head2 Symbolic references

We said that references spring into existence as necessary if they are
undefined, but we didn't say what happens if a value used as a
reference is already defined, but I<ISN'T> a hard reference. If you
use it as a reference in this case, it'll be treated as a symbolic
reference. That is, the value of the scalar is taken to be the I<NAME>
of a variable, rather than a direct link to a (possibly) anonymous
value.

People frequently expect it to work like this. So it does.

    $name = "foo";
    $$name = 1;                 # Sets $foo
    ${$name} = 2;               # Sets $foo
    ${$name x 2} = 3;           # Sets $foofoo
    $name->[0] = 4;             # Sets $foo[0]
    @$name = ();                # Clears @foo
    &$name();                   # Calls &foo() (as in Perl 4)
    $pack = "THAT";
    ${"${pack}::$name"} = 5;    # Sets $THAT::foo without eval

This is very powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
accidentally use a symbolic reference instead. To protect against
that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing
block. An inner block may countermand that with

    no strict 'refs';
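
The effect of the two pragmas can be sketched as follows. The eval BLOCK
is there only to catch the run-time error so the example can continue; the
variable names are illustrative.

```perl
use strict;      # includes 'refs'
use warnings;

our $foo = "untouched";
my $name = "foo";

# Under strict refs the symbolic dereference dies instead of setting $foo.
my $ok    = eval { $$name = 1; 1 };
my $error = $@;

{
    no strict 'refs';       # countermand for this block only
    $$name = 2;             # sets $main::foo
}

print "strict attempt ", ($ok ? "succeeded" : "failed"), "\n";
print "\$foo is now $foo\n";
```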

Only package variables are visible to symbolic references. Lexical
variables (declared with my()) aren't in a symbol table, and thus are
invisible to this mechanism. For example:

    local($value) = 10;
    $ref = "value";
    {
        my $value = 20;
        print $$ref;
    }

This will still print 10, not 20. Remember that local() affects package
variables, which are all "global" to the package.
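
The same point can be made in a form that runs under C<use strict>, with
the package variable declared by C<our> and the symbolic lookup confined
to a block that turns strict refs off. This is a sketch with made-up
names, not the only way to write it.

```perl
use strict;
use warnings;

our $value = 10;                # package variable, lives in the symbol table

my $seen;
{
    my $value = 20;             # lexical, not in any symbol table
    my $name  = "value";
    no strict 'refs';
    $seen = $$name;             # looks up $main::value by name
}

print "symbolic lookup saw $seen\n";
```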

=head2 Not-so-symbolic references

A new feature contributing to readability in perl version 5.001 is that the
brackets around a symbolic reference behave more like quotes, just as they
always have within a string. That is,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", despite the fact that push is
a reserved word. This has been generalized to work the same outside
of quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect. (This would have been a syntax error in
Perl 5.000, though Perl 4 allowed it in the spaceless form.) Note that this
construct is I<not> considered to be a symbolic reference when you're
using strict refs:

    use strict 'refs';
    ${ bareword };      # Okay, means $bareword.
    ${ "bareword" };    # Error, symbolic reference.

Similarly, because of all the subscripting that is done using single
words, we've applied the same rule to any bareword that is used for
subscripting a hash. So now, instead of writing

    $array{ "aaa" }{ "bbb" }{ "ccc" }

you can write just

    $array{ aaa }{ bbb }{ ccc }

and not worry about whether the subscripts are reserved words. In the
rare event that you do wish to do something like

    $array{ shift }

you can force interpretation as a reserved word by adding anything that
makes it more than a bareword:

    $array{ shift() }
    $array{ +shift }
    $array{ shift @_ }

The B<-w> switch will warn you if it interprets a reserved word as a string.
But it will no longer warn you about using lowercase words, because the
string is effectively quoted.
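
The difference between the bareword subscript and the forced function
call is sketched below. The hash contents and the two subroutine names are
invented for the example.

```perl
use strict;
use warnings;

my %array = (
    shift => 'the key named "shift"',
    aaa   => 'the key named "aaa"',
);

sub by_word { return $array{ shift } }      # bareword: the string "shift"
sub by_call { return $array{ shift() } }    # function call: the first argument

print by_word("aaa"), "\n";
print by_call("aaa"), "\n";
```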

=head1 WARNING

You may not (usefully) use a reference as the key to a hash. It will be
converted into a string:

    $x{ \$a } = $a;

If you try to dereference the key, it won't do a hard dereference, and
you won't accomplish what you're attempting. You might want to do something
more like

    $r = \@a;
    $x{ $r } = $r;

And then at least you can use the values(), which will be
real refs, instead of the keys(), which won't.
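
A short sketch of that workaround shows the key arriving as a plain
string while the value remains a usable reference. The variable names
follow the fragment above; the rest is illustrative.

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my $r = \@a;
my %x;
$x{ $r } = $r;

my ($key)   = keys %x;
my ($value) = values %x;

print "key is ",   (ref($key)   ? "a reference" : "a plain string: $key"), "\n";
print "value is ", (ref($value) ? "a reference" : "a plain string"), "\n";
print "elements via value: @$value\n";
```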

=head1 SEE ALSO

Besides the obvious documents, source code can be instructive.
Some rather pathological examples of the use of references can be found
in the F<t/op/ref.t> regression test in the Perl source directory.

See also L<perldsc> and L<perllol> for how to use references to create
complex data structures, and L<perlobj> for how to use them to create
objects.