pod/perldsc.pod

=head1 NAME
X<data structure> X<complex data structure> X<struct>

perldsc - Perl Data Structures Cookbook

=head1 DESCRIPTION

Perl lets us have complex data structures.  You can write something like
this and all of a sudden, you'd have an array with three dimensions!

    for my $x (1 .. 10) {
        for my $y (1 .. 10) {
            for my $z (1 .. 10) {
                $AoA[$x][$y][$z] =
                    $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more
elaborate construct than meets the eye!

How do you print it out?  Why can't you say just C<print @AoA>?  How do
you sort it?  How can you pass it to a function or get one of these back
from a function?  Is it an object?  Can you save it to disk to read
back later?  How do you access whole rows or columns of that matrix?  Do
all the values have to be numeric?

As you see, it's quite easy to become confused.  While some small portion
of the blame for this can be attributed to the reference-based
implementation, it's really more due to a lack of existing documentation with
examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the
many different sorts of data structures you might want to develop.  It
should also serve as a cookbook of examples.  That way, when you need to
create one of these complex data structures, you can just pinch, pilfer, or
purloin a drop-in example from here.

Let's look at each of these possible constructs in detail.  There are separate
sections on each of the following:

=over 5

=item * arrays of arrays

=item * hashes of arrays

=item * arrays of hashes

=item * hashes of hashes

=item * more elaborate constructs

=back

But for now, let's look at general issues common to all
these types of data structures.

=head1 REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

The most important thing to understand about all data structures in
Perl--including multidimensional arrays--is that even though they might
appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
one-dimensional.  They can hold only scalar values (meaning a string,
number, or a reference).  They cannot directly contain other arrays or
hashes, but instead contain I<references> to other arrays or hashes.
X<multidimensional array> X<array, multidimensional>
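
As a minimal sketch of that point, the following uses two hypothetical
arrays, C<@row_a> and C<@row_b> (names invented for this illustration).
Placing the arrays themselves in a list flattens them into one
one-dimensional list, while storing references keeps the rows separate.

```perl
use strict;
use warnings;

# Hypothetical sample rows, invented for this illustration.
my @row_a = (1, 2, 3);
my @row_b = (4, 5, 6);

# Putting arrays inside a list flattens them: one array of six scalars.
my @flat = (@row_a, @row_b);

# Storing references keeps the rows separate: one array of two scalars,
# each of which is a reference to another array.
my @nested = (\@row_a, \@row_b);

print scalar(@flat), "\n";      # 6
print scalar(@nested), "\n";    # 2
print ref($nested[0]), "\n";    # ARRAY
print $nested[1][2], "\n";      # 6
```

The last line reaches through the reference in C<$nested[1]> to the
third element of C<@row_b>.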

You can't use a reference to an array or hash in quite the same way that you
would a real array or hash.  For C or C++ programmers unused to
distinguishing between arrays and pointers to the same, this can be
confusing.  If so, just think of it as the difference between a structure
and a pointer to a structure.

You can (and should) read more about references in L<perlref>.
Briefly, references are rather like pointers that know what they
point to.  (Objects are also a kind of reference, but we won't be needing
them right away--if ever.)  This means that when you have something which
looks to you like an access to a two-or-more-dimensional array and/or hash,
what's really going on is that the base type is
merely a one-dimensional entity that contains references to the next
level.  It's just that you can I<use> it as though it were a
two-dimensional one.  This is actually the way almost all C
multidimensional arrays work as well.

    $array[7][12]                       # array of arrays
    $array[7]{string}                   # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print
out your array with a simple print() function, you'll get something
that doesn't look very nice, like this:

    my @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
  7
    print @AoA;
  ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)

That's because Perl doesn't (ever) implicitly dereference your variables.
If you want to get at the thing a reference is referring to, then you have
to do this yourself using either prefix typing indicators, like
C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.
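
The following sketch shows one way to do that dereferencing by hand for
the C<@AoA> above.  The variable names C<@lines>, C<$row>, C<$prefix>,
C<$arrow>, and C<$implied> are invented for this illustration.

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

# Dereference each row explicitly; join() then sees plain lists.
my @lines = map { join(" ", @{$_}) } @AoA;
print "$_\n" for @lines;

# The same element reached three ways.
my $row     = $AoA[1];
my $prefix  = ${$row}[2];     # prefix typing indicator
my $arrow   = $row->[2];      # postfix pointer arrow
my $implied = $AoA[1][2];     # the arrow between subscripts is optional

print "$prefix $arrow $implied\n";
```

Each row prints on its own line as a space-separated list, and the
final line prints the same value three times.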

=head1 COMMON MISTAKES

The two most common mistakes made in constructing something like
an array of arrays are accidentally counting the number of
elements and taking a reference to the same memory location
repeatedly.  Here's the case where you just get the count instead
of a nested array:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = @array;      # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting
its element count.  If that's what you really and truly want, then you
might do well to consider being a tad more explicit about it, like this:

    for my $i (1..10) {
        my @array = somefunc($i);
        $counts[$i] = scalar @array;
    }

Here's the case of taking a reference to the same memory location
again and again:

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = \@array;     # WRONG!
    }

So, what's the big problem with that?  It looks right, doesn't it?
After all, I just told you that you need an array of references, so by
golly, you've made me one!

Unfortunately, while this is true, it's still broken.  All the references
in @AoA refer to the I<very same place>, and they will therefore all hold
whatever was last in @array!  It's similar to the problem demonstrated in
the following C program:

    #include <stdio.h>
    #include <pwd.h>

    int main(void) {
        struct passwd *rp, *dp;

        /* getpwnam() may return a pointer to static storage,
           which the next call is free to overwrite */
        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
                dp->pw_name, rp->pw_name);
        return 0;
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both C<rp> and C<dp> are pointers to the same location
in memory!  In C, you'd have to remember to malloc() yourself some new
memory.  In Perl, you'll want to use the array constructor C<[]> or the
hash constructor C<{}> instead.   Here's the right way to do the preceding
broken code fragments:
X<[]> X<{}>

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a I<copy>
of what's in @array at the time of the assignment.  This is what
you want.
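
To see the difference side by side, here is a small self-contained
sketch.  It replaces the undefined C<somefunc()> with made-up values,
and the names C<@shared> and C<@copied> are invented for this
illustration.

```perl
use strict;
use warnings;

my (@shared, @copied);
my @array;                       # one variable, declared outside the loop

for my $i (0 .. 2) {
    @array = ($i, $i * 10);      # stand-in for somefunc($i)
    $shared[$i] = \@array;       # every slot refers to the same @array
    $copied[$i] = [ @array ];    # each slot gets its own fresh copy
}

print "shared: ", join(" | ", map { "@$_" } @shared), "\n";
print "copied: ", join(" | ", map { "@$_" } @copied), "\n";
# shared: 2 20 | 2 20 | 2 20
# copied: 0 0 | 1 10 | 2 20
```

Every element of C<@shared> shows whatever was last stored in
C<@array>, while C<@copied> keeps each iteration's values.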

Note that this will produce something similar:

    # Either without strict or having an outer-scope my @array;
    # declaration.
    for my $i (1..10) {
        @array = 0 .. $i;
        $AoA[$i]->@* = @array;
    }

Is it the same?  Well, maybe so--and maybe not.  The subtle difference
is that when you assign something in square brackets, you know for sure
it's always a brand new reference with a new I<copy> of the data.
Something else could be going on in this new case with the
C<< $AoA[$i]->@* >> dereference on the left-hand-side of the assignment.
It all depends on whether C<$AoA[$i]> had been undefined to start with,
or whether it already contained a reference.  If you had already
populated @AoA with references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would
use the existing reference that was already there:

    $AoA[3]->@* = @array;

Of course, this I<would> have the "interesting" effect of clobbering
@another_array.  (Have you ever noticed how when a programmer says
something is "interesting", that rather than meaning "intriguing",
they're disturbingly more apt to mean that it's "annoying",
"difficult", or both?  :-)
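
Both outcomes can be seen in a short sketch.  It uses the circumfix
form C<@{ ... }> so that it also runs on older perls, and the sample
values and the names C<$fresh> and C<$same> are invented for this
illustration.

```perl
use strict;
use warnings;

my @AoA;
my @array         = (1, 2, 3);
my @another_array = ("a", "b");

# Slot 0 is undefined, so assigning through it creates a new array.
@{ $AoA[0] } = @array;

# Slot 1 already refers to @another_array, so assigning through it
# overwrites the contents of @another_array itself.
$AoA[1] = \@another_array;
@{ $AoA[1] } = @array;

my $fresh = $AoA[0] != \@array;            # true: a separate array
my $same  = $AoA[1] == \@another_array;    # true: the very same array

print "another_array is now: @another_array\n";   # 1 2 3
```

The original contents of C<@another_array> are gone, even though the
code never assigned to it by name.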

So just remember always to use the array or hash constructors with C<[]>
or C<{}>, and you'll be fine, although it's not always optimally
efficient.

Surprisingly, the following dangerous-looking construct will
actually work out fine:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a
compile-time declaration I<per se>.  This means that the my() variable is
remade afresh each time through the loop.  So even though it I<looks> as
though you stored the same variable reference each time, you actually did
not!  This is a subtle distinction that can produce more efficient code at
the risk of misleading all but the most experienced of programmers.  So I
usually advise against teaching it to beginners.  In fact, except for
passing arguments to functions, I seldom like to see the gimme-a-reference
operator (backslash) used much at all in code.  Instead, I advise
beginners that they (and most of the rest of us) should try to use the
much more easily understood constructors C<[]> and C<{}> instead of
relying upon lexical (or dynamic) scoping and hidden reference-counting to
do the right thing behind the scenes.
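
If you want to convince yourself of this, a sketch along the following
lines counts how many different arrays the stored references point to.
The sample data and the C<%seen> hash are invented for this
illustration.

```perl
use strict;
use warnings;

my @AoA;
for my $i (0 .. 2) {
    my @array = ($i) x 3;     # a new @array each time through the loop
    $AoA[$i] = \@array;
}

# A reference used as a hash key stringifies to something like
# ARRAY(0x...), so distinct arrays produce distinct keys.
my %seen;
$seen{$_}++ for @AoA;

print scalar(keys %seen), " distinct arrays\n";   # 3 distinct arrays
print "@{ $AoA[1] }\n";                           # 1 1 1
```

Moving the C<my @array> declaration outside the loop would bring the
count down to one.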

Note also that there exists another way to write a dereference!  These
two lines are equivalent:

    $AoA[$i]->@* = @array;
    @{ $AoA[$i] } = @array;

The first form, called I<postfix dereference>, is generally easier to
read, because the expression can be read from left to right, and there
are no enclosing braces to balance.  On the other hand, it is also
newer.  It was added to the language in 2014 (as an experimental
feature in Perl 5.20; it has been stable since 5.24), so you will often
encounter the other form, I<circumfix dereference>, in older code.

In summary:

    $AoA[$i] = [ @array ];     # usually best
    $AoA[$i] = \@array;        # perilous; just how my() was that array?
    $AoA[$i]->@*  = @array;    # way too tricky for most programmers
    @{ $AoA[$i] } = @array;    # just as tricky, and also harder to read

=head1 CAVEAT ON PRECEDENCE
X<dereference, precedence> X<dereferencing, precedence>

Speaking of things like C<@{$AoA[$i]}>, the following are actually the
same thing:
X<< -> >>

    $aref->[2][2]       # clear
    $$aref[2][2]        # confusing

That's because Perl's precedence rules on its five prefix dereferencers
(which look like someone swearing: C<$ @ * % &>) make them bind more
tightly than the postfix subscripting brackets or braces!  This will no
doubt come as a great shock to the C or C++ programmer, who is quite
accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
element of C<a>.  That is, they first take the subscript, and only then
dereference the thing at that subscript.  That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, C<$$aref[$i]>, first does
the deref of $aref, taking $aref as a reference to an array, and then
tells you the I<i'th> value of the array pointed to by $aref.  If you
wanted the C notion, you could write C<< $AoA[$i]->$* >> to explicitly
dereference the I<i'th> item of @AoA, reading left to right.
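
A short sketch may make the binding order concrete.  The data and the
array of scalar references C<@refs> are invented for this illustration,
and the braced circumfix form is used for the "subscript first" case so
that the code also runs on older perls.

```perl
use strict;
use warnings;

my $aref = [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ] ];
my @refs = ( \"zero", \"one", \"two" );   # array of scalar references

# The prefix $ binds to $aref first, so these three mean the same thing.
my $arrow  = $aref->[2][2];
my $prefix = $$aref[2][2];
my $braced = ${$aref}[2][2];

# To subscript first and dereference second (the C reading of *a[i]),
# brace the subscripted expression.
my $second = ${ $refs[1] };

print "$arrow $prefix $braced $second\n";   # 9 9 9 one
```

Under C<use strict>, writing C<$$refs[1]> here would not even compile,
because it would look for a scalar named C<$refs>.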

=head1 WHY YOU SHOULD ALWAYS C<use strict>

If this is starting to sound scarier than it's worth, relax.  Perl has
some features to help you avoid its most common pitfalls.  The best
way to avoid getting confused is to start every program with:

    use strict;

This way, you'll be forced to declare all your variables with my() and
also disallow accidental "symbolic dereferencing".  Therefore if you'd done
this:

    my $aref = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error I<at compile time>,
because you were accidentally accessing C<@aref>, an undeclared
variable, and it would thereby remind you to write instead:

    print $aref->[2][2]

=head1 DEBUGGING
X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

You can use the debugger's C<x> command to dump out complex data structures.
For example, given the array of arrays assigned to $aref above (called
$AoA in this session), here's the debugger output:

    DB<1> x $AoA
    $AoA = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
          0  'fred'
          1  'barney'
          2  'pebbles'
          3  'bambam'
          4  'dino'
       1  ARRAY(0x13b558)
          0  'homer'
          1  'bart'
          2  'marge'
          3  'maggie'
       2  ARRAY(0x13b540)
          0  'george'
          1  'jane'
          2  'elroy'
          3  'judy'
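
Outside the debugger, one possible alternative (not something this
document itself prescribes) is the core L<Data::Dumper> module, which
turns a structure into Perl source text.  The sketch below is an
assumption about how you might use it; the two configuration variables
only tidy up the layout.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order for any hashes
$Data::Dumper::Indent   = 1;   # compact, fixed indentation

my $aref = [
    [ "fred", "barney" ],
    [ "homer", "bart" ],
];

# Dumper() returns a string of Perl code that rebuilds the structure.
my $dump = Dumper($aref);
print $dump;
```

The output begins with C<$VAR1 = [> and shows each nested array on its
own indented lines.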

=head1 CODE EXAMPLES

Presented with little comment here are short code examples illustrating
access of various types of data structures.

=head1 ARRAYS OF ARRAYS
X<array of arrays> X<AoA>

=head2 Declaration of an ARRAY OF ARRAYS

 my @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
      );

=head2 Generation of an ARRAY OF ARRAYS

 # reading from file
 while ( <> ) {
     push @AoA, [ split ];
 }

 # calling a function
 for my $i ( 1 .. 10 ) {
     $AoA[$i] = [ somefunc($i) ];
 }

 # using temp vars
 for my $i ( 1 .. 10 ) {
     my @tmp = somefunc($i);
     $AoA[$i] = [ @tmp ];
 }

 # add to an existing row
 push $AoA[0]->@*, "wilma", "betty";

=head2 Access and Printing of an ARRAY OF ARRAYS

 # one element
 $AoA[0][0] = "Fred";

 # another element
 $AoA[1][1] =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for my $aref ( @AoA ) {
     print "\t [ @$aref ],\n";
 }
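
 # print the whole thing with indices
 # (postfix dereference interpolates in strings only under
 # "use v5.24" or "use feature 'postderef_qq'", so use @{ } here)
 for my $i ( 0 .. $#AoA ) {
     print "\t [ @{ $AoA[$i] } ],\n";
 }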

 # print the whole thing one at a time
 for my $i ( 0 .. $#AoA ) {
     for my $j ( 0 .. $AoA[$i]->$#* ) {
         print "elem at ($i, $j) is $AoA[$i][$j]\n";
     }
 }
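
The description at the top asks how to access whole rows or columns.
One way to do it, sketched here with a slightly larger sample table and
invented variable names (C<@row>, C<@column>, C<@pair>), is to
dereference for a row and to collect a column one row at a time.

```perl
use strict;
use warnings;

my @AoA = (
    [ "fred",   "barney", "wilma" ],
    [ "george", "jane",   "elroy" ],
    [ "homer",  "marge",  "bart"  ],
);

# A whole row is a single dereference.
my @row = @{ $AoA[1] };

# A column has to be collected one row at a time.
my @column = map { $_->[0] } @AoA;

# A slice of one row.
my @pair = @{ $AoA[2] }[0, 1];

print "row:    @row\n";      # row:    george jane elroy
print "column: @column\n";   # column: fred george homer
print "pair:   @pair\n";     # pair:   homer marge
```

Note that C<@row> is a copy, so changing it does not change C<@AoA>.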

=head1 HASHES OF ARRAYS
X<hash of arrays> X<HoA>

=head2 Declaration of a HASH OF ARRAYS

 my %HoA = (
        flintstones        => [ "fred", "barney" ],
        jetsons            => [ "george", "jane", "elroy" ],
        simpsons           => [ "homer", "marge", "bart" ],
      );

=head2 Generation of a HASH OF ARRAYS

 # reading from file
 # flintstones: fred barney wilma dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $HoA{$1} = [ split ];
 }

 # reading from file; more temps
 # flintstones: fred barney wilma dino
 while ( my $line = <> ) {
     my ($who, $rest) = split /:\s*/, $line, 2;
     my @fields = split ' ', $rest;
     $HoA{$who} = [ @fields ];
 }

 # calling a function that returns a list
 for my $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoA{$group} = [ get_family($group) ];
 }

 # likewise, but using temps
 for my $group ( "simpsons", "jetsons", "flintstones" ) {
     my @members = get_family($group);
     $HoA{$group} = [ @members ];
 }

 # append new members to an existing family
 push $HoA{flintstones}->@*, "wilma", "betty";

=head2 Access and Printing of a HASH OF ARRAYS

 # one element
 $HoA{flintstones}[0] = "Fred";

 # another element
 $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

 # print the whole thing
 # (@{ } is used inside strings because ->@* interpolates only
 # under "use v5.24" or "use feature 'postderef_qq'")
 foreach my $family ( keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n";
 }

 # print the whole thing with indices
 foreach my $family ( keys %HoA ) {
     print "$family: ";
     foreach my $i ( 0 .. $HoA{$family}->$#* ) {
         print " $i = $HoA{$family}[$i]";
     }
     print "\n";
 }

 # print the whole thing sorted by number of members
 foreach my $family ( sort { $HoA{$b}->@* <=> $HoA{$a}->@* } keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n";
 }

 # print the whole thing sorted by number of members and name
 foreach my $family ( sort {
                            $HoA{$b}->@* <=> $HoA{$a}->@*
                                          ||
                                      $a cmp $b
            } keys %HoA )
 {
     print "$family: ", join(", ", sort $HoA{$family}->@* ), "\n";
 }

=head1 ARRAYS OF HASHES
X<array of hashes> X<AoH>

=head2 Declaration of an ARRAY OF HASHES

 my @AoH = (
        {
            Lead     => "fred",
            Friend   => "barney",
        },
        {
            Lead     => "george",
            Wife     => "jane",
            Son      => "elroy",
        },
        {
            Lead     => "homer",
            Wife     => "marge",
            Son      => "bart",
        }
  );

=head2 Generation of an ARRAY OF HASHES

 # reading from file
 # format: LEAD=fred FRIEND=barney
 while ( <> ) {
     my $rec = {};
     for my $field ( split ) {
         my ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
     push @AoH, $rec;
 }

 # reading from file
 # format: LEAD=fred FRIEND=barney
 # no temp
 while ( <> ) {
     push @AoH, { split /[\s=]+/ };
 }

 # calling a function  that returns a key/value pair list, like
 # "lead","fred","daughter","pebbles"
 while ( my %fields = getnextpairset() ) {
     push @AoH, { %fields };
 }

 # likewise, but using no temp vars
 while (<>) {
     push @AoH, { parsepairs($_) };
 }

 # add key/value to an element
 $AoH[0]{pet} = "dino";
 $AoH[2]{pet} = "santa's little helper";

=head2 Access and Printing of an ARRAY OF HASHES

 # one element (keys are case-sensitive, so match the
 # capitalization used in the declaration above)
 $AoH[0]{Lead} = "fred";

 # another element
 $AoH[1]{Lead} =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for my $href ( @AoH ) {
     print "{ ";
     for my $role ( keys %$href ) {
         print "$role=$href->{$role} ";
     }
     print "}\n";
 }

 # print the whole thing with indices
 for my $i ( 0 .. $#AoH ) {
     print "$i is { ";
     for my $role ( keys $AoH[$i]->%* ) {
         print "$role=$AoH[$i]{$role} ";
     }
     print "}\n";
 }

 # print the whole thing one at a time
 for my $i ( 0 .. $#AoH ) {
     for my $role ( keys $AoH[$i]->%* ) {
         print "elem at ($i, $role) is $AoH[$i]{$role}\n";
     }
 }

=head1 HASHES OF HASHES
X<hash of hashes> X<HoH>

=head2 Declaration of a HASH OF HASHES

 my %HoH = (
        flintstones => {
                lead      => "fred",
                pal       => "barney",
        },
        jetsons     => {
                lead      => "george",
                wife      => "jane",
                "his boy" => "elroy",
        },
        simpsons    => {
                lead      => "homer",
                wife      => "marge",
                kid       => "bart",
        },
 );

=head2 Generation of a HASH OF HASHES

 # reading from file
 # flintstones: lead=fred pal=barney wife=wilma pet=dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     my $who = $1;
     for my $field ( split ) {
         my ($key, $value) = split /=/, $field;
         $HoH{$who}{$key} = $value;
     }
 }


 # reading from file; more temps
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     my $who = $1;
     my $rec = {};
     $HoH{$who} = $rec;
     for my $field ( split ) {
         my ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
 }

 # calling a function  that returns a key,value hash
 for my $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoH{$group} = { get_family($group) };
 }

 # likewise, but using temps
 for my $group ( "simpsons", "jetsons", "flintstones" ) {
     my %members = get_family($group);
     $HoH{$group} = { %members };
 }

 # append new members to an existing family
 my %new_folks = (
     wife => "wilma",
     pet  => "dino",
 );

 for my $what (keys %new_folks) {
     $HoH{flintstones}{$what} = $new_folks{$what};
 }

=head2 Access and Printing of a HASH OF HASHES

 # one element
 $HoH{flintstones}{wife} = "wilma";

 # another element
 $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

 # print the whole thing
 foreach my $family ( keys %HoH ) {
     print "$family: { ";
     for my $role ( keys $HoH{$family}->%* ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # print the whole thing  somewhat sorted
 foreach my $family ( sort keys %HoH ) {
     print "$family: { ";
     for my $role ( sort keys $HoH{$family}->%* ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # print the whole thing sorted by number of members
 # (keys() is used for the count because a bare hash in numeric
 # context yields the number of keys only from Perl 5.26 on)
 foreach my $family ( sort { keys($HoH{$b}->%*) <=> keys($HoH{$a}->%*) }
                           keys %HoH )
 {
     print "$family: { ";
     for my $role ( sort keys $HoH{$family}->%* ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # establish a sort order (rank) for each role
 my $i = 0;
 my %rank;
 for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

 # now print the whole thing sorted by number of members
 foreach my $family ( sort { keys($HoH{$b}->%*) <=> keys($HoH{$a}->%*) }
                           keys %HoH )
 {
     print "$family: { ";
     # and print these according to rank order
     for my $role ( sort { $rank{$a} <=> $rank{$b} }
                                               keys $HoH{$family}->%* )
     {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

=head1 MORE ELABORATE RECORDS
X<record> X<structure> X<struct>

=head2 Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of
many different sorts:

     my $rec = {
         TEXT      => $string,
         SEQUENCE  => [ @old_values ],
         LOOKUP    => { %some_table },
         THATCODE  => \&some_function,
         THISCODE  => sub { $_[0] ** $_[1] },
         HANDLE    => \*STDOUT,
     };

     print $rec->{TEXT};

     print $rec->{SEQUENCE}[0];
     my $last = pop $rec->{SEQUENCE}->@*;

     print $rec->{LOOKUP}{"key"};
     my ($first_k, $first_v) = each $rec->{LOOKUP}->%*;

     my $answer = $rec->{THATCODE}->($arg);
     $answer = $rec->{THISCODE}->($arg1, $arg2);

     # careful of extra block braces on fh ref
     print { $rec->{HANDLE} } "a string\n";

     use FileHandle;
     $rec->{HANDLE}->autoflush(1);
     $rec->{HANDLE}->print(" a string\n");

=head2 Declaration of a HASH OF COMPLEX RECORDS

     my %TV = (
        flintstones => {
            series   => "flintstones",
            nights   => [ qw(monday thursday friday) ],
            members  => [
                { name => "fred",    role => "lead", age  => 36, },
                { name => "wilma",   role => "wife", age  => 31, },
                { name => "pebbles", role => "kid",  age  =>  4, },
            ],
        },

        jetsons     => {
            series   => "jetsons",
            nights   => [ qw(wednesday saturday) ],
            members  => [
                { name => "george",  role => "lead", age  => 41, },
                { name => "jane",    role => "wife", age  => 39, },
                { name => "elroy",   role => "kid",  age  =>  9, },
            ],
         },

        simpsons    => {
            series   => "simpsons",
            nights   => [ qw(monday) ],
            members  => [
                { name => "homer", role => "lead", age  => 34, },
                { name => "marge", role => "wife", age  => 37, },
                { name => "bart",  role => "kid",  age  => 11, },
            ],
         },
      );

=head2 Generation of a HASH OF COMPLEX RECORDS

     # reading from file
     # this is most easily done by having the file itself be
     # in the raw data format as shown above.  perl is happy
     # to parse complex data structures if declared as data, so
     # sometimes it's easiest to do that

     # here's a piece by piece build up
     my $rec = {};
     $rec->{series} = "flintstones";
     $rec->{nights} = [ find_days() ];

     my @members = ();
     # assume this file in field=value syntax
     while (<>) {
         my %fields = split /[\s=]+/;
         push @members, { %fields };
     }
     $rec->{members} = [ @members ];

     # now remember the whole thing
     $TV{ $rec->{series} } = $rec;

     ###########################################################
     # now, you might want to make interesting extra fields that
     # include pointers back into the same data structure so if
     # you change one piece, it changes everywhere, like for
     # example if you wanted a {kids} field that was a reference
     # to an array of the kids' records without having duplicate
     # records and thus update problems.
     ###########################################################
     foreach my $family (keys %TV) {
         my $rec = $TV{$family}; # temp pointer
         my @kids = ();
         for my $person ( $rec->{members}->@* ) {
             if ($person->{role} =~ /kid|son|daughter/) {
                 push @kids, $person;
             }
         }
         # REMEMBER: $rec and $TV{$family} point to same data!!
         $rec->{kids} = [ @kids ];
     }

     # you copied the array, but the array itself contains pointers
     # to uncopied objects. this means that if you make bart get
     # older via

     $TV{simpsons}{kids}[0]{age}++;

     # then this would also change in
     print $TV{simpsons}{members}[2]{age};

     # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
     # both point to the same underlying anonymous hash table

     # print the whole thing
     # (@{ } is used inside the string because ->@* interpolates
     # only under "use v5.24" or "use feature 'postderef_qq'")
     foreach my $family ( keys %TV ) {
         print "the $family";
         print " is on during @{ $TV{$family}{nights} }\n";
         print "its members are:\n";
         for my $who ( $TV{$family}{members}->@* ) {
             print " $who->{name} ($who->{role}), age $who->{age}\n";
         }
         # the lead is one of the members, not a field of its own
         my ($lead) = grep { $_->{role} eq "lead" }
                           $TV{$family}{members}->@*;
         print "it turns out that $lead->{name} has ";
         print scalar ( $TV{$family}{kids}->@* ), " kids named ";
         print join (", ", map { $_->{name} } $TV{$family}{kids}->@* );
         print "\n";
     }
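
The comments above point out that copying an array of references copies
only the references.  If you do want independent records, one possible
approach (an assumption of this sketch, not something the code above
uses) is C<dclone> from the core L<Storable> module.  The names
C<@shallow> and C<@deep> are invented for this illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my @members = (
    { name => "homer", role => "lead", age => 34 },
    { name => "bart",  role => "kid",  age => 11 },
);

my @shallow = @members;                  # copies the references only
my @deep    = @{ dclone(\@members) };    # copies the hashes as well

$shallow[1]{age}++;        # also changes $members[1]
$deep[1]{age} += 10;       # changes only the deep copy

print "members: $members[1]{age}\n";     # members: 12
print "shallow: $shallow[1]{age}\n";     # shallow: 12
print "deep:    $deep[1]{age}\n";        # deep:    21
```

Whether sharing or copying is right depends on whether you want an
update in one place to show up everywhere.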

=head1 Database Ties

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file.  The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk.  One experimental
module that does partially attempt to address this need is the MLDBM
module.  Check your nearest CPAN site as described in L<perlmodlib> for
source code to MLDBM.
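
If all you need is to save a whole structure to disk and read it back
later, rather than tie it to a dbm file, the core L<Storable> module is
one option.  This is a different technique from MLDBM, shown only as a
sketch; the temporary file created with L<File::Temp> is a hypothetical
stand-in for whatever path you would really use.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

my %HoH = (
    flintstones => { lead => "fred",   pal  => "barney" },
    jetsons     => { lead => "george", wife => "jane"   },
);

# Hypothetical scratch file; any writable path would do.
my ($fh, $filename) = tempfile( UNLINK => 1 );
close $fh;

store( \%HoH, $filename );          # write the whole structure
my $copy = retrieve($filename);     # read it back as a hash reference

print "$copy->{jetsons}{wife}\n";   # jane
```

The structure that comes back is a separate copy, so changing it does
not affect C<%HoH>.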

=head1 SEE ALSO

L<perlref>, L<perllol>, L<perldata>, L<perlobj>

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>