=head1 NAME
X<data structure> X<complex data structure> X<struct>

perldsc - Perl Data Structures Cookbook

=head1 DESCRIPTION

The single feature most sorely lacking in the Perl programming language
prior to its 5.0 release was complex data structures. Even without direct
language support, some valiant programmers did manage to emulate them, but
it was hard work and not for the faint of heart. You could occasionally
get away with the C<$m{$AoA,$b}> notation borrowed from B<awk>, in which the
keys are actually joined into a single string, C<"$AoA$;$b"> (with the
subscript separator C<$;> between the parts), but
traversal and sorting were difficult. More desperate programmers even
hacked Perl's internal symbol table directly, a strategy that proved hard
to develop and maintain--to put it mildly.

The 5.0 release of Perl let us have complex data structures. You
can now write something like this and, all of a sudden, you have an array
with three dimensions!

    for $x (1 .. 10) {
        for $y (1 .. 10) {
            for $z (1 .. 10) {
                $AoA[$x][$y][$z] =
                    $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more
elaborate construct than meets the eye!

How do you print it out? Why can't you say just C<print @AoA>? How do
you sort it? How can you pass it to a function or get one of these back
from a function? Is it an object? Can you save it to disk to read
back later? How do you access whole rows or columns of that matrix? Do
all the values have to be numeric?

As you see, it's quite easy to become confused. While some small portion
of the blame for this can be attributed to the reference-based
implementation, it's really more due to a lack of existing documentation with
examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the
many different sorts of data structures you might want to develop. It
should also serve as a cookbook of examples. That way, when you need to
create one of these complex data structures, you can just pinch, pilfer, or
purloin a drop-in example from here.

Let's look at each of these possible constructs in detail. There are separate
sections on each of the following:

=over 5

=item * arrays of arrays

=item * hashes of arrays

=item * arrays of hashes

=item * hashes of hashes

=item * more elaborate constructs

=back

But for now, let's look at general issues common to all
these types of data structures.

=head1 REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

The most important thing to understand about all data structures in
Perl--including multidimensional arrays--is that even though they might
appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
one-dimensional. They can hold only scalar values (meaning a string,
number, or a reference). They cannot directly contain other arrays or
hashes, but instead contain I<references> to other arrays or hashes.
X<multidimensional array> X<array, multidimensional>

You can't use a reference to an array or hash in quite the same way that you
would a real array or hash. For C or C++ programmers unused to
distinguishing between arrays and pointers to the same, this can be
confusing. If so, just think of it as the difference between a structure
and a pointer to a structure.

You can (and should) read more about references in L<perlref>.
Briefly, references are rather like pointers that know what they
point to. (Objects are also a kind of reference, but we won't be needing
them right away--if ever.) This means that when you have something which
looks to you like an access to a two-or-more-dimensional array and/or hash,
what's really going on is that the base type is
merely a one-dimensional entity that contains references to the next
level. It's just that you can I<use> it as though it were a
two-dimensional one. This is actually the way almost all C
multidimensional arrays work as well.

    $array[7][12]                       # array of arrays
    $array[7]{string}                   # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print
out your array with a simple print() function, you'll get something
that doesn't look very nice, like this:

    @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
      7
    print @AoA;
      ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)

That's because Perl doesn't (ever) implicitly dereference your variables.
If you want to get at the thing a reference is referring to, then you have
to do this yourself using either prefix typing indicators, like
C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.

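As a small sketch of the two dereferencing styles just described, the
following prints each row of the sample @AoA by dereferencing it explicitly.
The variable names C<@rows>, C<$via_arrow>, and C<$via_prefix> are
illustrative only.

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

# Dereference each row explicitly; print() will not do it for us.
my @rows = map { "[ @$_ ]" } @AoA;
print "$_\n" for @rows;

# Both spellings reach the same element.
my $aref = \@AoA;
my $via_arrow  = $aref->[1][2];         # postfix arrow
my $via_prefix = ${ $aref->[1] }[2];    # prefix dereference of one row
print "$via_arrow $via_prefix\n";
```

Each row comes out as a readable list instead of an C<ARRAY(0x...)> string.
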
=head1 COMMON MISTAKES

The two most common mistakes made in constructing something like
an array of arrays are either accidentally counting the number of
elements or else taking a reference to the same memory location
repeatedly. Here's the case where you just get the count instead
of a nested array:

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = @array;      # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting
its element count. If that's what you really and truly want, then you
might do well to consider being a tad more explicit about it, like this:

    for $i (1..10) {
        @array = somefunc($i);
        $counts[$i] = scalar @array;
    }

Here's the case of taking a reference to the same memory location
again and again:

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = \@array;     # WRONG!
    }

So, what's the big problem with that? It looks right, doesn't it?
After all, I just told you that you need an array of references, so by
golly, you've made me one!

Unfortunately, while this is true, it's still broken. All the references
in @AoA refer to the I<very same place>, and they will therefore all hold
whatever was last in @array! It's similar to the problem demonstrated in
the following C program:

    #include <pwd.h>
    #include <stdio.h>

    int main(void) {
        struct passwd *rp, *dp;

        /* getpwnam() may return a pointer to the same static buffer
           on every call, so the second call overwrites the first */
        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
               dp->pw_name, rp->pw_name);
        return 0;
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both C<rp> and C<dp> are pointers to the same location
in memory! In C, you'd have to remember to malloc() yourself some new
memory. In Perl, you'll want to use the array constructor C<[]> or the
hash constructor C<{}> instead. Here's the right way to do the preceding
broken code fragments:
X<[]> X<{}>

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a I<copy>
of what's in @array at the time of the assignment. This is what
you want.

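To make the difference visible, here is a minimal sketch that fills two
arrays side by side, one with C<\@array> and one with C<[ @array ]>. The
somefunc() defined here is a hypothetical stand-in, since the real one is
never shown, and C<@shared> and C<@copied> are names invented for the
comparison.

```perl
use strict;
use warnings;

# Hypothetical stand-in for somefunc(): returns ($i, $i * 10).
sub somefunc { my ($i) = @_; return ($i, $i * 10) }

my (@shared, @copied);
my @array;                       # one variable reused on every pass
for my $i (1 .. 3) {
    @array = somefunc($i);
    $shared[$i] = \@array;       # every slot refers to the same @array
    $copied[$i] = [ @array ];    # every slot gets its own copy
}

print "shared: @{ $shared[1] }\n";   # whatever was last stored in @array
print "copied: @{ $copied[1] }\n";   # the values from the first pass
```

Only the copied rows keep the values they had when they were stored.
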
Note that this will produce something similar, but it's
much harder to read:

    for $i (1..10) {
        @array = 0 .. $i;
        @{$AoA[$i]} = @array;
    }

Is it the same? Well, maybe so--and maybe not. The subtle difference
is that when you assign something in square brackets, you know for sure
it's always a brand new reference with a new I<copy> of the data.
Something else could be going on in this new case with the C<@{$AoA[$i]}>
dereference on the left-hand-side of the assignment. It all depends on
whether C<$AoA[$i]> had been undefined to start with, or whether it
already contained a reference. If you had already populated @AoA with
references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would
use the existing reference that was already there:

    @{$AoA[3]} = @array;

Of course, this I<would> have the "interesting" effect of clobbering
@another_array. (Have you ever noticed how when a programmer says
something is "interesting", that rather than meaning "intriguing",
they're disturbingly more apt to mean that it's "annoying",
"difficult", or both? :-)

So just remember always to use the array or hash constructors with C<[]>
or C<{}>, and you'll be fine, although it's not always optimally
efficient.

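The clobbering effect is easy to reproduce. This sketch assumes small
sample values for @another_array and @array, and uses slot 4 as an extra,
previously undefined slot for contrast.

```perl
use strict;
use warnings;

my @another_array = ( 'a', 'b', 'c' );
my @array         = ( 1, 2 );

my @AoA;
$AoA[3] = \@another_array;    # slot 3 already holds a reference
@{ $AoA[3] } = @array;        # assigns through it: @another_array is overwritten
@{ $AoA[4] } = @array;        # slot 4 was undefined: a new array is created

print "another_array: @another_array\n";
print "row 4: @{ $AoA[4] }\n";
```

The same statement either reuses an existing array or creates a new one,
depending on what the slot held beforehand.
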
Surprisingly, the following dangerous-looking construct will
actually work out fine:

    for $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a
compile-time declaration I<per se>. This means that the my() variable is
remade afresh each time through the loop. So even though it I<looks> as
though you stored the same variable reference each time, you actually did
not! This is a subtle distinction that can produce more efficient code at
the risk of misleading all but the most experienced of programmers. So I
usually advise against teaching it to beginners. In fact, except for
passing arguments to functions, I seldom like to see the gimme-a-reference
operator (backslash) used much at all in code. Instead, I advise
beginners that they (and most of the rest of us) should try to use the
much more easily understood constructors C<[]> and C<{}> instead of
relying upon lexical (or dynamic) scoping and hidden reference-counting to
do the right thing behind the scenes.

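A quick way to convince yourself that my() yields a fresh array on each
pass is to compare the stored references. As before, the somefunc() here is
a hypothetical stand-in that returns two numbers.

```perl
use strict;
use warnings;

# Hypothetical stand-in for somefunc(): returns ($i, $i * 10).
sub somefunc { my ($i) = @_; return ($i, $i * 10) }

my @AoA;
for my $i (1 .. 3) {
    my @array = somefunc($i);   # a new @array on each iteration
    $AoA[$i] = \@array;         # so each reference is distinct
}

print "row $_: @{ $AoA[$_] }\n" for 1 .. 3;
```

Every row keeps its own values, because each reference points to a
different array.
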
In summary:

    $AoA[$i] = [ @array ];     # usually best
    $AoA[$i] = \@array;        # perilous; just how my() was that array?
    @{ $AoA[$i] } = @array;    # way too tricky for most programmers

=head1 CAVEAT ON PRECEDENCE
X<dereference, precedence> X<dereferencing, precedence>

Speaking of things like C<@{$AoA[$i]}>, the following are actually the
same thing:
X<< -> >>

    $aref->[2][2]       # clear
    $$aref[2][2]        # confusing

That's because Perl's precedence rules on its five prefix dereferencers
(which look like someone swearing: C<$ @ * % &>) make them bind more
tightly than the postfix subscripting brackets or braces! This will no
doubt come as a great shock to the C or C++ programmer, who is quite
accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
element of C<a>. That is, they first take the subscript, and only then
dereference the thing at that subscript. That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, C<$$aref[$i]>, first
dereferences $aref, treating it as a reference to an
array, and only then gives you the I<i'th> value
of the array pointed to by $aref. If you wanted the C notion, you'd have to
write C<${$AoA[$i]}> to force the C<$AoA[$i]> to get evaluated first
before the leading C<$> dereferencer.

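The following sketch puts the equivalent spellings next to each other. The
two-row @AoA and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @AoA  = ( [ 'a', 'b' ], [ 'c', 'd' ] );
my $aref = \@AoA;

my $arrow  = $aref->[1][0];     # clear
my $prefix = $$aref[1][0];      # same thing: $aref is dereferenced first
my $braces = ${$aref}[1][0];    # same again, with the grouping spelled out

# Subscript first, then dereference: element 1 of the row held in $AoA[0].
my $row_elem = ${ $AoA[0] }[1];

print "$arrow $prefix $braces $row_elem\n";
```

The first three expressions all name the same element; only the last one
subscripts @AoA before dereferencing.
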
=head1 WHY YOU SHOULD ALWAYS C<use strict>

If this is starting to sound scarier than it's worth, relax. Perl has
some features to help you avoid its most common pitfalls. The best
way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my(), and
accidental "symbolic dereferencing" will be disallowed. Therefore if you'd done
this:

    my $aref = [
        [ "fred",   "barney", "pebbles", "bambam", "dino", ],
        [ "homer",  "bart",   "marge",   "maggie", ],
        [ "george", "jane",   "elroy",   "judy", ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error I<at compile time>,
because you were accidentally accessing C<@aref>, an undeclared
variable, and it would thereby remind you to write instead:

    print $aref->[2][2];

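One way to watch that compile-time error without breaking your own script
is to compile the faulty snippet inside a string eval. This is only a
sketch: the shortened data and the names C<$ok> and C<$error> are
assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Compile the faulty snippet with a string eval so that this script itself
# still compiles; the error message ends up in $@.
my $ok = eval q{
    use strict;
    my $aref = [ [ "fred", "barney" ], [ "homer", "bart" ] ];
    my $name = $aref[1][1];    # oops: this is @aref, not $aref
    1;
};
my $error = $@;
print "compile failed: $error" unless $ok;

my $aref = [ [ "fred", "barney" ], [ "homer", "bart" ] ];
my $name = $aref->[1][1];      # the intended access
print "$name\n";
```

The message stored in C<$error> names the undeclared C<@aref>, which is the
hint you need to add the missing arrow.
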
=head1 DEBUGGING
X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

Before version 5.002, the standard Perl debugger didn't do a very nice job of
printing out complex data structures. With 5.002 or above, the
debugger includes several new features, including command line editing as
well as the C<x> command to dump out complex data structures. For
example, given a variable $AoA holding the same structure as the $aref
assigned above, here's the debugger output:

    DB<1> x $AoA
    $AoA = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
          0  'fred'
          1  'barney'
          2  'pebbles'
          3  'bambam'
          4  'dino'
       1  ARRAY(0x13b558)
          0  'homer'
          1  'bart'
          2  'marge'
          3  'maggie'
       2  ARRAY(0x13b540)
          0  'george'
          1  'jane'
          2  'elroy'
          3  'judy'

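Outside the debugger, a similar dump can be had from the Data::Dumper
module. That module is not mentioned in the text above, so treat this as
one possible approach rather than the recommended one; the shortened $AoA
is sample data.

```perl
use strict;
use warnings;
use Data::Dumper;

my $AoA = [
    [ "fred",  "barney", "pebbles" ],
    [ "homer", "bart",   "marge"   ],
];

local $Data::Dumper::Indent   = 1;   # compact nesting
local $Data::Dumper::Sortkeys = 1;   # stable order for any hashes
my $dump = Dumper($AoA);
print $dump;
```

The output is itself valid Perl, which can be handy when you want to paste
a structure back into a test.
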
=head1 CODE EXAMPLES

Presented with little comment (these will get their own manpages someday)
here are short code examples illustrating access of various
types of data structures.

=head1 ARRAYS OF ARRAYS
X<array of arrays> X<AoA>

=head2 Declaration of an ARRAY OF ARRAYS

    @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
    );

=head2 Generation of an ARRAY OF ARRAYS

    # reading from file
    while ( <> ) {
        push @AoA, [ split ];
    }

    # calling a function
    for $i ( 1 .. 10 ) {
        $AoA[$i] = [ somefunc($i) ];
    }

    # using temp vars
    for $i ( 1 .. 10 ) {
        @tmp = somefunc($i);
        $AoA[$i] = [ @tmp ];
    }

    # add to an existing row
    push @{ $AoA[0] }, "wilma", "betty";

=head2 Access and Printing of an ARRAY OF ARRAYS

    # one element
    $AoA[0][0] = "Fred";

    # another element
    $AoA[1][1] =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $aref ( @AoA ) {
        print "\t [ @$aref ],\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoA ) {
        print "\t [ @{$AoA[$i]} ],\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoA ) {
        for $j ( 0 .. $#{ $AoA[$i] } ) {
            print "elt $i $j is $AoA[$i][$j]\n";
        }
    }

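The question of how to get at whole rows or columns can be sketched like
this, using the same sample families. The names C<@row>, C<@column>, and
C<@slice> are illustrative.

```perl
use strict;
use warnings;

my @AoA = (
    [ "fred",   "barney"        ],
    [ "george", "jane", "elroy" ],
    [ "homer",  "marge", "bart" ],
);

# A whole row is just the dereferenced inner array.
my @row = @{ $AoA[1] };

# A column has to be collected one row at a time.
my @column = map { $_->[1] } @AoA;

# A slice of one row: elements 0 and 2 of row 2.
my @slice = @{ $AoA[2] }[0, 2];

print "row: @row\ncolumn: @column\nslice: @slice\n";
```

Rows come for free because each one is a real array; columns exist only as
a convention, so they take a map() or a loop.
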
=head1 HASHES OF ARRAYS
X<hash of arrays> X<HoA>

=head2 Declaration of a HASH OF ARRAYS

    %HoA = (
        flintstones => [ "fred", "barney" ],
        jetsons     => [ "george", "jane", "elroy" ],
        simpsons    => [ "homer", "marge", "bart" ],
    );

=head2 Generation of a HASH OF ARRAYS

    # reading from file
    # flintstones: fred barney wilma dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $HoA{$1} = [ split ];
    }

    # reading from file; more temps
    # flintstones: fred barney wilma dino
    while ( $line = <> ) {
        ($who, $rest) = split /:\s*/, $line, 2;
        @fields = split ' ', $rest;
        $HoA{$who} = [ @fields ];
    }

    # calling a function that returns a list
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoA{$group} = [ get_family($group) ];
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        @members = get_family($group);
        $HoA{$group} = [ @members ];
    }

    # append new members to an existing family
    push @{ $HoA{"flintstones"} }, "wilma", "betty";

=head2 Access and Printing of a HASH OF ARRAYS

    # one element
    $HoA{flintstones}[0] = "Fred";

    # another element
    $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n";
    }

    # print the whole thing with indices
    foreach $family ( keys %HoA ) {
        print "$family: ";
        foreach $i ( 0 .. $#{ $HoA{$family} } ) {
            print " $i = $HoA{$family}[$i]";
        }
        print "\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n";
    }

    # print the whole thing sorted by number of members and name
    foreach $family ( sort {
                            @{$HoA{$b}} <=> @{$HoA{$a}}
                                        ||
                                    $a cmp $b
                      } keys %HoA )
    {
        print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
    }

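Here is a self-contained sketch that combines the generation and sorting
steps above. It reads from an in-memory list instead of a file; C<@lines>
and C<@order> are names invented for the example.

```perl
use strict;
use warnings;

# Sample input lines in the "family: member member ..." format used above.
my @lines = (
    "flintstones: fred barney wilma dino",
    "jetsons: george jane elroy",
    "simpsons: homer marge bart",
);

my %HoA;
for my $line (@lines) {
    my ($who, $rest) = split /:\s*/, $line, 2;
    next unless defined $rest;          # skip lines without a colon
    push @{ $HoA{$who} }, split ' ', $rest;
}

# Largest family first; ties broken by name.
my @order = sort {
    @{ $HoA{$b} } <=> @{ $HoA{$a} } or $a cmp $b
} keys %HoA;

print "$_: @{ $HoA{$_} }\n" for @order;
```

Note that push() on C<@{ $HoA{$who} }> creates the inner array on first
use, so no explicit C<[]> is needed in this particular case.
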
=head1 ARRAYS OF HASHES
X<array of hashes> X<AoH>

=head2 Declaration of an ARRAY OF HASHES

    @AoH = (
        {
            Lead     => "fred",
            Friend   => "barney",
        },
        {
            Lead     => "george",
            Wife     => "jane",
            Son      => "elroy",
        },
        {
            Lead     => "homer",
            Wife     => "marge",
            Son      => "bart",
        }
    );

=head2 Generation of an ARRAY OF HASHES

    # reading from file
    # format: LEAD=fred FRIEND=barney
    while ( <> ) {
        $rec = {};
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
        push @AoH, $rec;
    }

    # reading from file
    # format: LEAD=fred FRIEND=barney
    # no temp
    while ( <> ) {
        push @AoH, { split /[\s=]+/ };
    }

    # calling a function that returns a key/value pair list, like
    # "lead","fred","daughter","pebbles"
    while ( %fields = getnextpairset() ) {
        push @AoH, { %fields };
    }

    # likewise, but using no temp vars
    while (<>) {
        push @AoH, { parsepairs($_) };
    }

    # add key/value to an element
    $AoH[0]{pet} = "dino";
    $AoH[2]{pet} = "santa's little helper";

=head2 Access and Printing of an ARRAY OF HASHES

    # one element
    $AoH[0]{Lead} = "fred";

    # another element
    $AoH[1]{Lead} =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $href ( @AoH ) {
        print "{ ";
        for $role ( keys %$href ) {
            print "$role=$href->{$role} ";
        }
        print "}\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoH ) {
        print "$i is { ";
        for $role ( keys %{ $AoH[$i] } ) {
            print "$role=$AoH[$i]{$role} ";
        }
        print "}\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoH ) {
        for $role ( keys %{ $AoH[$i] } ) {
            print "elt $i $role is $AoH[$i]{$role}\n";
        }
    }

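The file-reading recipes above can be tried without a file by feeding them
an in-memory list. This is a sketch under that assumption; C<@lines> holds
made-up records in the same C<KEY=value> format.

```perl
use strict;
use warnings;

# Sample records in the "KEY=value KEY=value" format used above.
my @lines = ( "LEAD=fred FRIEND=barney", "LEAD=homer WIFE=marge SON=bart" );

my @AoH;
for my $line (@lines) {
    my %rec;
    for my $field ( split ' ', $line ) {
        my ($key, $value) = split /=/, $field, 2;
        $rec{$key} = $value;
    }
    push @AoH, { %rec };            # store a copy, not a reference to %rec
}

for my $i ( 0 .. $#AoH ) {
    my $href = $AoH[$i];
    print "$i: ", join( " ", map { $_ . "=" . $href->{$_} } sort keys %$href ), "\n";
}
```

Sorting the keys gives a stable printout, since hash order is otherwise
unpredictable.
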
=head1 HASHES OF HASHES
X<hash of hashes> X<HoH>

=head2 Declaration of a HASH OF HASHES

    %HoH = (
        flintstones => {
            lead      => "fred",
            pal       => "barney",
        },
        jetsons     => {
            lead      => "george",
            wife      => "jane",
            "his boy" => "elroy",
        },
        simpsons    => {
            lead      => "homer",
            wife      => "marge",
            kid       => "bart",
        },
    );

=head2 Generation of a HASH OF HASHES

    # reading from file
    # flintstones: lead=fred pal=barney wife=wilma pet=dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $HoH{$who}{$key} = $value;
        }
    }

    # reading from file; more temps
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        $rec = {};
        $HoH{$who} = $rec;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
    }

    # calling a function that returns a key,value hash
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoH{$group} = { get_family($group) };
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        %members = get_family($group);
        $HoH{$group} = { %members };
    }

    # append new members to an existing family
    %new_folks = (
        wife => "wilma",
        pet  => "dino",
    );

    for $what (keys %new_folks) {
        $HoH{flintstones}{$what} = $new_folks{$what};
    }

=head2 Access and Printing of a HASH OF HASHES

    # one element
    $HoH{flintstones}{wife} = "wilma";

    # another element
    $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoH ) {
        print "$family: { ";
        for $role ( keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing somewhat sorted
    foreach $family ( sort keys %HoH ) {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # establish a sort order (rank) for each role
    $i = 0;
    for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

    # now print the whole thing sorted by number of members
    foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) {
        print "$family: { ";
        # and print these according to rank order
        for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

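The rank-ordered printout can be sketched as a complete script. One detail
is an assumption added here: roles missing from the rank list (such as
C<kid> and C<"his boy"> in the sample data) are given a fallback rank of 99
so they sort last. The C<//> operator used for that needs Perl 5.10 or
later.

```perl
use strict;
use warnings;

my %HoH = (
    flintstones => { lead => "fred",   pal  => "barney" },
    jetsons     => { lead => "george", wife => "jane", "his boy" => "elroy" },
    simpsons    => { lead => "homer",  wife => "marge", kid => "bart" },
);

# Rank the known roles; anything else sorts last, alphabetically.
my $i = 0;
my %rank = map { $_ => ++$i } qw(lead wife son daughter pal pet);
my $by_rank = sub {
    ( $rank{$a} // 99 ) <=> ( $rank{$b} // 99 ) or $a cmp $b;
};

my %line;
for my $family ( sort keys %HoH ) {
    my $members = $HoH{$family};
    $line{$family} = join " ",
        map { $_ . "=" . $members->{$_} } sort $by_rank keys %$members;
    print "$family: { $line{$family} }\n";
}
```

Without the fallback, comparing an unranked role would compare undef values
and trigger warnings.
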
=head1 MORE ELABORATE RECORDS
X<record> X<structure> X<struct>

=head2 Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of
many different sorts:

    $rec = {
        TEXT      => $string,
        SEQUENCE  => [ @old_values ],
        LOOKUP    => { %some_table },
        THATCODE  => \&some_function,
        THISCODE  => sub { $_[0] ** $_[1] },
        HANDLE    => \*STDOUT,
    };

    print $rec->{TEXT};

    print $rec->{SEQUENCE}[0];
    $last = pop @{ $rec->{SEQUENCE} };

    print $rec->{LOOKUP}{"key"};
    ($first_k, $first_v) = each %{ $rec->{LOOKUP} };

    $answer = $rec->{THATCODE}->($arg);
    $answer = $rec->{THISCODE}->($arg1, $arg2);

    # careful of extra block braces on fh ref
    print { $rec->{HANDLE} } "a string\n";

    use FileHandle;
    $rec->{HANDLE}->autoflush(1);
    $rec->{HANDLE}->print(" a string\n");

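Because the record above depends on variables that are never defined, here
is a runnable sketch with concrete values filled in. The double() function
and all the sample data are hypothetical stand-ins.

```perl
use strict;
use warnings;

sub double { return 2 * $_[0] }    # hypothetical stand-in for some_function

my $rec = {
    TEXT     => "hello",
    SEQUENCE => [ 1, 2, 3 ],
    LOOKUP   => { key => "value" },
    THATCODE => \&double,
    THISCODE => sub { $_[0] ** $_[1] },
    HANDLE   => \*STDOUT,
};

my $last   = pop @{ $rec->{SEQUENCE} };     # 3; SEQUENCE is now [1, 2]
my $twice  = $rec->{THATCODE}->(21);        # named sub through a reference
my $power  = $rec->{THISCODE}->(2, 10);     # anonymous sub
my @fields = sort keys %$rec;

# The block around the handle expression is required by print's syntax.
print { $rec->{HANDLE} } "$last $twice $power $rec->{LOOKUP}{key}\n";
```

Each field is reached with the same arrow syntax, whatever kind of thing it
holds.
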
=head2 Declaration of a HASH OF COMPLEX RECORDS

    %TV = (
        flintstones => {
            series   => "flintstones",
            nights   => [ qw(monday thursday friday) ],
            members  => [
                { name => "fred",    role => "lead", age  => 36, },
                { name => "wilma",   role => "wife", age  => 31, },
                { name => "pebbles", role => "kid",  age  =>  4, },
            ],
        },

        jetsons     => {
            series   => "jetsons",
            nights   => [ qw(wednesday saturday) ],
            members  => [
                { name => "george",  role => "lead", age  => 41, },
                { name => "jane",    role => "wife", age  => 39, },
                { name => "elroy",   role => "kid",  age  =>  9, },
            ],
        },

        simpsons    => {
            series   => "simpsons",
            nights   => [ qw(monday) ],
            members  => [
                { name => "homer", role => "lead", age  => 34, },
                { name => "marge", role => "wife", age  => 37, },
                { name => "bart",  role => "kid",  age  => 11, },
            ],
        },
    );

=head2 Generation of a HASH OF COMPLEX RECORDS

    # reading from file
    # this is most easily done by having the file itself be
    # in the raw data format as shown above. perl is happy
    # to parse complex data structures if declared as data, so
    # sometimes it's easiest to do that

    # here's a piece by piece build up
    $rec = {};
    $rec->{series} = "flintstones";
    $rec->{nights} = [ find_days() ];

    @members = ();
    # assume this file is in field=value syntax
    while (<>) {
        %fields = split /[\s=]+/;
        push @members, { %fields };
    }
    $rec->{members} = [ @members ];

    # now remember the whole thing
    $TV{ $rec->{series} } = $rec;

    ###########################################################
    # now, you might want to make interesting extra fields that
    # include pointers back into the same data structure so if
    # you change one piece, it changes everywhere, like for example
    # if you wanted a {kids} field that was a reference
    # to an array of the kids' records without having duplicate
    # records and thus update problems.
    ###########################################################
    foreach $family (keys %TV) {
        $rec = $TV{$family}; # temp pointer
        @kids = ();
        for $person ( @{ $rec->{members} } ) {
            if ($person->{role} =~ /kid|son|daughter/) {
                push @kids, $person;
            }
        }
        # REMEMBER: $rec and $TV{$family} point to same data!!
        $rec->{kids} = [ @kids ];
    }

    # you copied the array, but the array itself contains pointers
    # to uncopied objects. this means that if you make bart get
    # older via

    $TV{simpsons}{kids}[0]{age}++;

    # then this would also change in
    print $TV{simpsons}{members}[2]{age};

    # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
    # both point to the same underlying anonymous hash table

    # print the whole thing
    foreach $family ( keys %TV ) {
        print "the $family";
        print " is on during @{ $TV{$family}{nights} }\n";
        print "its members are:\n";
        for $who ( @{ $TV{$family}{members} } ) {
            print " $who->{name} ($who->{role}), age $who->{age}\n";
        }
        # the records have no {lead} field, so look the lead up by role
        ($lead) = grep { $_->{role} eq "lead" } @{ $TV{$family}{members} };
        print "it turns out that $lead->{name} has ";
        print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
        print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
        print "\n";
    }

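The sharing between C<{kids}> and C<{members}> can be checked directly. This
sketch trims %TV down to one family and uses grep() in place of the
explicit loop; C<$age_via_members> is a name invented for the check.

```perl
use strict;
use warnings;

my %TV = (
    simpsons => {
        series  => "simpsons",
        members => [
            { name => "homer", role => "lead", age => 34 },
            { name => "marge", role => "wife", age => 37 },
            { name => "bart",  role => "kid",  age => 11 },
        ],
    },
);

for my $rec ( values %TV ) {
    # grep returns the very same hash references, not copies of the hashes.
    $rec->{kids} = [ grep { $_->{role} =~ /kid|son|daughter/ } @{ $rec->{members} } ];
}

$TV{simpsons}{kids}[0]{age}++;                          # bart gets older...
my $age_via_members = $TV{simpsons}{members}[2]{age};   # ...here as well
print "bart is now $age_via_members\n";
```

The outer C<[ ... ]> makes a new array, but its elements are still
references to the original member records.
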
=head1 Database Ties

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file. The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk. One experimental
module that does partially attempt to address this need is the MLDBM
module. Check your nearest CPAN site as described in L<perlmodlib> for
source code to MLDBM.

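The representation problem comes down to turning nested references into a
flat string. As a rough illustration of that idea only, and not of MLDBM
itself, the sketch below uses the Storable module, which is an assumption
on my part since the text above does not mention it.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my %HoH = (
    flintstones => { lead => "fred",  pal  => "barney" },
    simpsons    => { lead => "homer", wife => "marge"  },
);

# freeze() flattens the nested structure into one string, which is the
# kind of value a dbm file can hold; thaw() rebuilds it.
my $frozen = freeze( \%HoH );
my $copy   = thaw($frozen);

print "$copy->{simpsons}{wife}\n";
```

The thawed structure is an independent copy, so changes to it do not affect
the original %HoH.
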
=head1 SEE ALSO

L<perlref>, L<perllol>, L<perldata>, L<perlobj>

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>

Last update:
Wed Oct 23 04:57:50 MET DST 1996