=head1 NAME
X<data structure> X<complex data structure> X<struct>

perldsc - Perl Data Structures Cookbook

=head1 DESCRIPTION

Perl lets us have complex data structures. You can write something like
this and all of a sudden, you'd have an array with three dimensions!

    for my $x (1 .. 10) {
        for my $y (1 .. 10) {
            for my $z (1 .. 10) {
                $AoA[$x][$y][$z] =
                    $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more
elaborate construct than meets the eye!

How do you print it out? Why can't you say just C<print @AoA>? How do
you sort it? How can you pass it to a function or get one of these back
from a function? Is it an object? Can you save it to disk to read
back later? How do you access whole rows or columns of that matrix? Do
all the values have to be numeric?

As you see, it's quite easy to become confused. While some small portion
of the blame for this can be attributed to the reference-based
implementation, it's really more due to a lack of existing documentation with
examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the
many different sorts of data structures you might want to develop. It
should also serve as a cookbook of examples. That way, when you need to
create one of these complex data structures, you can just pinch, pilfer, or
purloin a drop-in example from here.

Let's look at each of these possible constructs in detail. There are separate
sections on each of the following:

=over 5

=item * arrays of arrays

=item * hashes of arrays

=item * arrays of hashes

=item * hashes of hashes

=item * more elaborate constructs

=back

But for now, let's look at general issues common to all
these types of data structures.

=head1 REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

The most important thing to understand about all data structures in
Perl--including multidimensional arrays--is that even though they might
appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
one-dimensional. They can hold only scalar values (meaning a string,
number, or a reference). They cannot directly contain other arrays or
hashes, but instead contain I<references> to other arrays or hashes.
X<multidimensional array> X<array, multidimensional>

You can't use a reference to an array or hash in quite the same way that you
would a real array or hash. For C or C++ programmers unused to
distinguishing between arrays and pointers to the same, this can be
confusing. If so, just think of it as the difference between a structure
and a pointer to a structure.

You can (and should) read more about references in L<perlref>.
Briefly, references are rather like pointers that know what they
point to. (Objects are also a kind of reference, but we won't be needing
them right away--if ever.) This means that when you have something which
looks to you like an access to a two-or-more-dimensional array and/or hash,
what's really going on is that the base type is
merely a one-dimensional entity that contains references to the next
level. It's just that you can I<use> it as though it were a
two-dimensional one. This is actually the way C arrays of pointers
(such as C<argv>) work as well.

    $array[7][12]                       # array of arrays
    $array[7]{string}                   # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print
out your array with a simple print() function, you'll get something
that doesn't look very nice, like this:

    my @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
      7
    print @AoA;
      ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)

That's because Perl doesn't (ever) implicitly dereference your variables.
If you want to get at the thing a reference is referring to, then you have
to do this yourself using either prefix typing indicators, like
C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.

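As a rough illustration of the dereferencing forms just described, the
following sketch prints the rows of a small array of arrays. The variable
names (C<@rows>, C<$first>, and so on) are made up for this example.

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

# Each element of @AoA is a reference; dereference it to reach the row.
my @rows;
for my $row (@AoA) {
    push @rows, "[ @{$row} ]";      # prefix form: @{$row} is the whole row
}
print "$_\n" for @rows;

my $first = $AoA[1]->[0];           # arrow form, written out in full
my $same  = $AoA[1][0];             # the arrow is optional between subscripts
my $last  = ${ $AoA[1] }[-1];       # prefix form applied to one element
print "$first $same $last\n";
```

All three element accesses reach into the same inner array; they differ only
in how explicitly the dereference is spelled.
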
=head1 COMMON MISTAKES

The two most common mistakes made in constructing something like
an array of arrays are either accidentally counting the number of
elements or else taking a reference to the same memory location
repeatedly. Here's the case where you just get the count instead
of a nested array:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = @array;      # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting
its element count. If that's what you really and truly want, then you
might do well to consider being a tad more explicit about it, like this:

    for my $i (1..10) {
        my @array = somefunc($i);
        $counts[$i] = scalar @array;
    }

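To see the difference concretely, here is a minimal sketch that stores both
the count and a real nested array side by side. The sample data is invented
for the illustration.

```perl
use strict;
use warnings;

my @array = ("fred", "barney", "wilma");
my @AoA;

$AoA[0] = @array;        # scalar assignment: stores the element count
$AoA[1] = [ @array ];    # stores a reference to a copy of the list

print "count: $AoA[0]\n";
print "row:   @{ $AoA[1] }\n";
```
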
Here's the case of taking a reference to the same memory location
again and again:

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = \@array;     # WRONG!
    }

So, what's the big problem with that? It looks right, doesn't it?
After all, I just told you that you need an array of references, so by
golly, you've made me one!

Unfortunately, while this is true, it's still broken. All the references
in @AoA refer to the I<very same place>, and they will therefore all hold
whatever was last in @array! It's similar to the problem demonstrated in
the following C program:

    #include <pwd.h>
    #include <stdio.h>

    int main(void) {
        struct passwd *rp, *dp;
        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
               dp->pw_name, rp->pw_name);
        return 0;
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both C<rp> and C<dp> are pointers to the same location
in memory! In C, you'd have to remember to malloc() yourself some new
memory. In Perl, you'll want to use the array constructor C<[]> or the
hash constructor C<{}> instead. Here's the right way to do the preceding
broken code fragments:
X<[]> X<{}>

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a I<copy>
of what's in @array at the time of the assignment. This is what
you want.

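The contrast between the broken and the corrected loop can be sketched as
follows. The names C<@shared> and C<@copied> are hypothetical, chosen only to
keep the two results apart.

```perl
use strict;
use warnings;

my (@array, @shared, @copied);    # @array lives outside the loop

for my $i (1 .. 3) {
    @array = ($i, $i * 10);
    $shared[$i] = \@array;        # every slot points at the one @array
    $copied[$i] = [ @array ];     # each slot gets its own fresh copy
}

print "shared: @{ $shared[1] }\n";    # shows what was last put in @array
print "copied: @{ $copied[1] }\n";    # shows what was there on pass 1
```

Because C<@array> is declared once, all of the C<\@array> references are
equal, while each C<[ @array ]> is a distinct array.
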
Note that this will produce something similar, but it's
much harder to read:

    # Either without strict or having an outer-scope my @array;
    # declaration.
    for my $i (1..10) {
        @array = 0 .. $i;
        @{$AoA[$i]} = @array;
    }

Is it the same? Well, maybe so--and maybe not. The subtle difference
is that when you assign something in square brackets, you know for sure
it's always a brand new reference with a new I<copy> of the data.
Something else could be going on in this new case with the C<@{$AoA[$i]}>
dereference on the left-hand-side of the assignment. It all depends on
whether C<$AoA[$i]> had been undefined to start with, or whether it
already contained a reference. If you had already populated @AoA with
references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would
use the existing reference that was already there:

    @{$AoA[3]} = @array;

Of course, this I<would> have the "interesting" effect of clobbering
@another_array. (Have you ever noticed how when a programmer says
something is "interesting", that rather than meaning "intriguing",
they're disturbingly more apt to mean that it's "annoying",
"difficult", or both? :-)

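A small sketch of both outcomes, assuming one slot that already holds a
reference and one that is still undefined (the slot numbers and data are
arbitrary):

```perl
use strict;
use warnings;

my @another_array = ("keep", "me");
my @array         = (1, 2, 3);
my @AoA;

$AoA[3] = \@another_array;    # slot 3 already holds a reference
@{ $AoA[3] } = @array;        # assigns through it, overwriting the target

@{ $AoA[4] } = @array;        # slot 4 was undef, so a new array springs up

print "another_array is now: @another_array\n";
print "slot 4 holds: @{ $AoA[4] }\n";
```
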
So just remember always to use the array or hash constructors with C<[]>
or C<{}>, and you'll be fine, although it's not always optimally
efficient.

Surprisingly, the following dangerous-looking construct will
actually work out fine:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a
compile-time declaration I<per se>. This means that the my() variable is
remade afresh each time through the loop. So even though it I<looks> as
though you stored the same variable reference each time, you actually did
not! This is a subtle distinction that can produce more efficient code at
the risk of misleading all but the most experienced of programmers. So I
usually advise against teaching it to beginners. In fact, except for
passing arguments to functions, I seldom like to see the gimme-a-reference
operator (backslash) used much at all in code. Instead, I advise
beginners that they (and most of the rest of us) should try to use the
much more easily understood constructors C<[]> and C<{}> instead of
relying upon lexical (or dynamic) scoping and hidden reference-counting to
do the right thing behind the scenes.

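One way to convince yourself that my() really does hand out a new array on
each pass is to count the distinct references afterwards. This is only a
sketch; the C<%seen> hash is a helper invented for the check.

```perl
use strict;
use warnings;

my @AoA;
for my $i (0 .. 2) {
    my @array = ($i) x 2;     # a fresh @array on every iteration
    $AoA[$i] = \@array;       # so each reference is to a different array
}

my %seen;
$seen{$_}++ for @AoA;         # references stringify to their addresses
print scalar(keys %seen), " distinct arrays\n";
print "row 0: @{ $AoA[0] }, row 2: @{ $AoA[2] }\n";
```
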
In summary:

    $AoA[$i] = [ @array ];     # usually best
    $AoA[$i] = \@array;        # perilous; just how my() was that array?
    @{ $AoA[$i] } = @array;    # way too tricky for most programmers

=head1 CAVEAT ON PRECEDENCE
X<dereference, precedence> X<dereferencing, precedence>

Speaking of things like C<@{$AoA[$i]}>, the following are actually the
same thing:
X<< -> >>

    $aref->[2][2]       # clear
    $$aref[2][2]        # confusing

That's because Perl's precedence rules on its five prefix dereferencers
(which look like someone swearing: C<$ @ * % &>) make them bind more
tightly than the postfix subscripting brackets or braces! This will no
doubt come as a great shock to the C or C++ programmer, who is quite
accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
element of C<a>. That is, they first take the subscript, and only then
dereference the thing at that subscript. That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, C<$$aref[$i]>, first
dereferences $aref, taking it as a reference to an array, and only then
tells you the I<i'th> value of the array pointed to by $aref. If you
wanted the C notion, you'd have to write C<${$AoA[$i]}> to force the
C<$AoA[$i]> to get evaluated first before the leading C<$> dereferencer.

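The two readings can be compared in a short sketch. Here C<@ptrs> is a
hypothetical array of scalar references standing in for the C-style array of
pointers; it is not part of the examples above.

```perl
use strict;
use warnings;

my $aref = [ "a0", "a1", "a2" ];          # a reference to an array
my @ptrs = ( \"s0", \"s1", \"s2" );       # an array of scalar references
my $i    = 1;

my $x = $$aref[$i];        # dereference $aref first, then subscript
my $y = ${$aref}[$i];      # the same thing with explicit braces
my $z = $aref->[$i];       # the same thing with an arrow

my $c = ${ $ptrs[$i] };    # C-like reading: subscript first, then dereference

print "$x $y $z $c\n";
```
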
=head1 WHY YOU SHOULD ALWAYS C<use strict>

If this is starting to sound scarier than it's worth, relax. Perl has
some features to help you avoid its most common pitfalls. The best
way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my(), and
accidental "symbolic dereferencing" will be disallowed. Therefore if you'd
done this:

    my $aref = [
        [ "fred",   "barney", "pebbles", "bambam", "dino", ],
        [ "homer",  "bart",   "marge",   "maggie",         ],
        [ "george", "jane",   "elroy",   "judy",           ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error I<at compile time>,
because you were accidentally accessing C<@aref>, an undeclared
variable, and it would thereby remind you to write instead:

    print $aref->[2][2]

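If you'd like to watch the compiler object without aborting your own
program, one possible approach is to compile the mistaken snippet inside a
string eval and inspect C<$@>. This is a sketch for experimentation only; the
variables C<$ok> and C<$error> are invented for it.

```perl
use strict;
use warnings;

# Compile the mistaken snippet in a string eval so the error can be inspected.
my $ok = eval q{
    use strict;
    my $aref = [ [ "fred", "barney" ], [ "homer", "bart" ] ];
    my $oops = $aref[1][1];     # subscripts @aref, which was never declared
    1;
};
my $error = $@;

print "compile failed: $error" unless $ok;
```
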
=head1 DEBUGGING
X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

You can use the debugger's C<x> command to dump out complex data structures.
For example, given the assignment to $aref above, here's the debugger output:

    DB<1> x $aref
    $aref = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
          0  'fred'
          1  'barney'
          2  'pebbles'
          3  'bambam'
          4  'dino'
       1  ARRAY(0x13b558)
          0  'homer'
          1  'bart'
          2  'marge'
          3  'maggie'
       2  ARRAY(0x13b540)
          0  'george'
          1  'jane'
          2  'elroy'
          3  'judy'

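Outside the debugger, the core Data::Dumper module gives a similar view. The
following is a minimal sketch rather than the only way to do it; the two
configuration settings are optional and merely tidy the output.

```perl
use strict;
use warnings;
use Data::Dumper;

my $aref = [
    [ "fred",  "barney" ],
    [ "homer", "bart"   ],
];

local $Data::Dumper::Indent   = 1;   # compact, fixed indentation
local $Data::Dumper::Sortkeys = 1;   # stable key order if hashes are present
my $dump = Dumper($aref);
print $dump;
```
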
=head1 CODE EXAMPLES

Presented with little comment (these will get their own manpages someday)
here are short code examples illustrating access of various
types of data structures.

=head1 ARRAYS OF ARRAYS
X<array of arrays> X<AoA>

=head2 Declaration of an ARRAY OF ARRAYS

    @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
    );

=head2 Generation of an ARRAY OF ARRAYS

    # reading from file
    while ( <> ) {
        push @AoA, [ split ];
    }

    # calling a function
    for $i ( 1 .. 10 ) {
        $AoA[$i] = [ somefunc($i) ];
    }

    # using temp vars
    for $i ( 1 .. 10 ) {
        @tmp = somefunc($i);
        $AoA[$i] = [ @tmp ];
    }

    # add to an existing row
    push @{ $AoA[0] }, "wilma", "betty";

=head2 Access and Printing of an ARRAY OF ARRAYS

    # one element
    $AoA[0][0] = "Fred";

    # another element
    $AoA[1][1] =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $aref ( @AoA ) {
        print "\t [ @$aref ],\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoA ) {
        print "\t [ @{$AoA[$i]} ],\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoA ) {
        for $j ( 0 .. $#{ $AoA[$i] } ) {
            print "elt $i $j is $AoA[$i][$j]\n";
        }
    }

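The DESCRIPTION asked how to get at whole rows or columns and how to sort
such a structure. One possible answer, sketched here with an extra name per
row so that the rows are of equal length, is:

```perl
use strict;
use warnings;

my @AoA = (
    [ "homer",  "marge",  "bart"  ],
    [ "fred",   "barney", "wilma" ],
    [ "george", "jane",   "elroy" ],
);

my @row    = @{ $AoA[1] };                        # copy of one whole row
my @column = map { $_->[1] } @AoA;                # element 1 of every row
my @sorted = sort { $a->[0] cmp $b->[0] } @AoA;   # rows ordered by first field

print "row:    @row\n";
print "column: @column\n";
print "first:  $_->[0]\n" for @sorted;
```

Note that C<@sorted> holds the same row references as C<@AoA>, just in a
different order; no row data is copied by the sort.
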
=head1 HASHES OF ARRAYS
X<hash of arrays> X<HoA>

=head2 Declaration of a HASH OF ARRAYS

    %HoA = (
        flintstones => [ "fred", "barney" ],
        jetsons     => [ "george", "jane", "elroy" ],
        simpsons    => [ "homer", "marge", "bart" ],
    );

=head2 Generation of a HASH OF ARRAYS

    # reading from file
    # flintstones: fred barney wilma dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $HoA{$1} = [ split ];
    }

    # reading from file; more temps
    # flintstones: fred barney wilma dino
    while ( $line = <> ) {
        ($who, $rest) = split /:\s*/, $line, 2;
        @fields = split ' ', $rest;
        $HoA{$who} = [ @fields ];
    }

    # calling a function that returns a list
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoA{$group} = [ get_family($group) ];
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        @members = get_family($group);
        $HoA{$group} = [ @members ];
    }

    # append new members to an existing family
    push @{ $HoA{"flintstones"} }, "wilma", "betty";

=head2 Access and Printing of a HASH OF ARRAYS

    # one element
    $HoA{flintstones}[0] = "Fred";

    # another element
    $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n"
    }

    # print the whole thing with indices
    foreach $family ( keys %HoA ) {
        print "$family: ";
        foreach $i ( 0 .. $#{ $HoA{$family} } ) {
            print " $i = $HoA{$family}[$i]";
        }
        print "\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n"
    }

    # print the whole thing sorted by number of members and name
    foreach $family ( sort {
                               @{$HoA{$b}} <=> @{$HoA{$a}}
                                           ||
                                       $a cmp $b
                      } keys %HoA )
    {
        print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
    }

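When the same key may turn up on more than one input line, pushing onto the
dereferenced slot lets the rows accumulate instead of being replaced. The
following sketch uses an in-memory list of lines (C<@lines>, invented for the
example) in place of a file:

```perl
use strict;
use warnings;

my @lines = (
    "flintstones: fred barney",
    "jetsons: george jane elroy",
    "flintstones: wilma",
);

my %HoA;
for my $line (@lines) {
    my ($family, $rest) = split /:\s*/, $line, 2;
    push @{ $HoA{$family} }, split ' ', $rest;   # creates the row on first use
}

for my $family (sort keys %HoA) {
    printf "%s (%d): %s\n", $family, scalar @{ $HoA{$family} },
        join ", ", @{ $HoA{$family} };
}
```
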
=head1 ARRAYS OF HASHES
X<array of hashes> X<AoH>

=head2 Declaration of an ARRAY OF HASHES

    @AoH = (
        {
            lead   => "fred",
            friend => "barney",
        },
        {
            lead => "george",
            wife => "jane",
            son  => "elroy",
        },
        {
            lead => "homer",
            wife => "marge",
            son  => "bart",
        }
    );

=head2 Generation of an ARRAY OF HASHES

    # reading from file
    # format: lead=fred friend=barney
    while ( <> ) {
        $rec = {};
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
        push @AoH, $rec;
    }

    # reading from file
    # format: lead=fred friend=barney
    # no temp
    while ( <> ) {
        push @AoH, { split /[\s=]+/ };
    }

    # calling a function that returns a key/value pair list, like
    # "lead","fred","daughter","pebbles"
    while ( %fields = getnextpairset() ) {
        push @AoH, { %fields };
    }

    # likewise, but using no temp vars
    while (<>) {
        push @AoH, { parsepairs($_) };
    }

    # add key/value to an element
    $AoH[0]{pet} = "dino";
    $AoH[2]{pet} = "santa's little helper";

=head2 Access and Printing of an ARRAY OF HASHES

    # one element
    $AoH[0]{lead} = "fred";

    # another element
    $AoH[1]{lead} =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $href ( @AoH ) {
        print "{ ";
        for $role ( keys %$href ) {
            print "$role=$href->{$role} ";
        }
        print "}\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoH ) {
        print "$i is { ";
        for $role ( keys %{ $AoH[$i] } ) {
            print "$role=$AoH[$i]{$role} ";
        }
        print "}\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoH ) {
        for $role ( keys %{ $AoH[$i] } ) {
            print "elt $i $role is $AoH[$i]{$role}\n";
        }
    }

=head1 HASHES OF HASHES
X<hash of hashes> X<HoH>

=head2 Declaration of a HASH OF HASHES

    %HoH = (
        flintstones => {
            lead      => "fred",
            pal       => "barney",
        },
        jetsons     => {
            lead      => "george",
            wife      => "jane",
            "his boy" => "elroy",
        },
        simpsons    => {
            lead      => "homer",
            wife      => "marge",
            kid       => "bart",
        },
    );

=head2 Generation of a HASH OF HASHES

    # reading from file
    # flintstones: lead=fred pal=barney wife=wilma pet=dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $HoH{$who}{$key} = $value;
        }
    }

    # reading from file; more temps
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        $rec = {};
        $HoH{$who} = $rec;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
    }

    # calling a function that returns a key,value hash
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoH{$group} = { get_family($group) };
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        %members = get_family($group);
        $HoH{$group} = { %members };
    }

    # append new members to an existing family
    %new_folks = (
        wife => "wilma",
        pet  => "dino",
    );

    for $what (keys %new_folks) {
        $HoH{flintstones}{$what} = $new_folks{$what};
    }

=head2 Access and Printing of a HASH OF HASHES

    # one element
    $HoH{flintstones}{wife} = "wilma";

    # another element
    $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoH ) {
        print "$family: { ";
        for $role ( keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing somewhat sorted
    foreach $family ( sort keys %HoH ) {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} }
                                                                keys %HoH )
    {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # establish a sort order (rank) for each role
    $i = 0;
    for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

    # now print the whole thing sorted by number of members
    foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } }
                                                                keys %HoH )
    {
        print "$family: { ";
        # and print these according to rank order
        for $role ( sort { $rank{$a} <=> $rank{$b} }
                                                  keys %{ $HoH{$family} } )
        {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

=head1 MORE ELABORATE RECORDS
X<record> X<structure> X<struct>

=head2 Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of
many different sorts:

    $rec = {
        TEXT      => $string,
        SEQUENCE  => [ @old_values ],
        LOOKUP    => { %some_table },
        THATCODE  => \&some_function,
        THISCODE  => sub { $_[0] ** $_[1] },
        HANDLE    => \*STDOUT,
    };

    print $rec->{TEXT};

    print $rec->{SEQUENCE}[0];
    $last = pop @{ $rec->{SEQUENCE} };

    print $rec->{LOOKUP}{"key"};
    ($first_k, $first_v) = each %{ $rec->{LOOKUP} };

    $answer = $rec->{THATCODE}->($arg);
    $answer = $rec->{THISCODE}->($arg1, $arg2);

    # careful of extra block braces on fh ref
    print { $rec->{HANDLE} } "a string\n";

    use FileHandle;
    $rec->{HANDLE}->autoflush(1);
    $rec->{HANDLE}->print(" a string\n");

=head2 Declaration of a HASH OF COMPLEX RECORDS

    %TV = (
        flintstones => {
            series   => "flintstones",
            nights   => [ qw(monday thursday friday) ],
            members  => [
                { name => "fred",    role => "lead", age  => 36, },
                { name => "wilma",   role => "wife", age  => 31, },
                { name => "pebbles", role => "kid",  age  =>  4, },
            ],
        },

        jetsons     => {
            series   => "jetsons",
            nights   => [ qw(wednesday saturday) ],
            members  => [
                { name => "george",  role => "lead", age  => 41, },
                { name => "jane",    role => "wife", age  => 39, },
                { name => "elroy",   role => "kid",  age  =>  9, },
            ],
        },

        simpsons    => {
            series   => "simpsons",
            nights   => [ qw(monday) ],
            members  => [
                { name => "homer", role => "lead", age  => 34, },
                { name => "marge", role => "wife", age => 37, },
                { name => "bart",  role => "kid",  age  =>  11, },
            ],
        },
    );

=head2 Generation of a HASH OF COMPLEX RECORDS

    # reading from file
    # this is most easily done by having the file itself be
    # in the raw data format as shown above. perl is happy
    # to parse complex data structures if declared as data, so
    # sometimes it's easiest to do that

    # here's a piece by piece build up
    $rec = {};
    $rec->{series} = "flintstones";
    $rec->{nights} = [ find_days() ];

    @members = ();
    # assume this file is in field=value syntax
    while (<>) {
        %fields = split /[\s=]+/;
        push @members, { %fields };
    }
    $rec->{members} = [ @members ];

    # now remember the whole thing
    $TV{ $rec->{series} } = $rec;

    ###########################################################
    # now, you might want to make interesting extra fields that
    # include pointers back into the same data structure so if
    # you change one piece, it changes everywhere, like for example
    # if you wanted a {kids} field that was a reference
    # to an array of the kids' records without having duplicate
    # records and thus update problems.
    ###########################################################
    foreach $family (keys %TV) {
        $rec = $TV{$family}; # temp pointer
        @kids = ();
        for $person ( @{ $rec->{members} } ) {
            if ($person->{role} =~ /kid|son|daughter/) {
                push @kids, $person;
            }
        }
        # REMEMBER: $rec and $TV{$family} point to same data!!
        $rec->{kids} = [ @kids ];
    }

    # you copied the array, but the array itself contains pointers
    # to uncopied objects. this means that if you make bart get
    # older via

    $TV{simpsons}{kids}[0]{age}++;

    # then this would also change in
    print $TV{simpsons}{members}[2]{age};

    # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
    # both point to the same underlying anonymous hash table

    # print the whole thing
    foreach $family ( keys %TV ) {
        print "the $family";
        print " is on during @{ $TV{$family}{nights} }\n";
        print "its members are:\n";
        for $who ( @{ $TV{$family}{members} } ) {
            print " $who->{name} ($who->{role}), age $who->{age}\n";
        }
        # the records have no {lead} field, so find the lead among the members
        ($lead) = grep { $_->{role} eq "lead" } @{ $TV{$family}{members} };
        print "it turns out that $lead->{name} has ";
        print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
        print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
        print "\n";
    }

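If shared records are I<not> what you want, you need a deep copy rather than
a copy of just the outer array. One way to get one, sketched here under the
assumption that the core Storable module is acceptable for your data, is its
C<dclone> function. The names C<@shallow> and C<$deep> are made up for the
illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my @members = (
    { name => "homer", role => "lead", age => 34 },
    { name => "bart",  role => "kid",  age => 11 },
);

my @shallow = @members;              # new array, same hash references
my $deep    = dclone(\@members);     # new array and new hashes

$members[1]{age}++;                  # bart gets older

print "shallow sees $shallow[1]{age}, deep still has $deep->[1]{age}\n";
```

The shallow copy follows the change because it holds the same anonymous
hashes; the deep copy does not.
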
=head1 Database Ties

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file. The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk. One experimental
module that does partially attempt to address this need is the MLDBM
module. Check your nearest CPAN site as described in L<perlmodlib> for
source code to MLDBM.

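If all you need is to save a structure to disk and read it back later,
without tying it to a dbm file, the core Storable module may be enough. Note
that this is a different technique from MLDBM, shown only as a sketch; the
temporary file is a stand-in for whatever path you would really use.

```perl
use strict;
use warnings;
use Storable   qw(nstore retrieve);
use File::Temp qw(tempfile);

my %HoH = (
    flintstones => { lead => "fred",  pal  => "barney" },
    simpsons    => { lead => "homer", wife => "marge"  },
);

my ($fh, $file) = tempfile(UNLINK => 1);   # scratch file, removed at exit
close $fh;

nstore(\%HoH, $file);                      # write the whole structure
my $restored = retrieve($file);            # read it back as a new hash ref

print "restored lead: $restored->{simpsons}{lead}\n";
```
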
=head1 SEE ALSO

L<perlref>, L<perllol>, L<perldata>, L<perlobj>

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>