| 1 | =head1 NAME |
| 2 | |
| 3 | perlobj - Perl objects |
| 4 | |
| 5 | =head1 DESCRIPTION |
| 6 | |
| 7 | First of all, you need to understand what references are in Perl. |
| 8 | See L<perlref> for that. Second, if you still find the following |
| 9 | reference work too complicated, a tutorial on object-oriented programming |
| 10 | in Perl can be found in L<perltoot>. |
| 11 | |
| 12 | If you're still with us, then |
| 13 | here are three very simple definitions that you should find reassuring. |
| 14 | |
| 15 | =over 4 |
| 16 | |
| 17 | =item 1. |
| 18 | |
| 19 | An object is simply a reference that happens to know which class it |
| 20 | belongs to. |
| 21 | |
| 22 | =item 2. |
| 23 | |
| 24 | A class is simply a package that happens to provide methods to deal |
| 25 | with object references. |
| 26 | |
| 27 | =item 3. |
| 28 | |
| 29 | A method is simply a subroutine that expects an object reference (or |
| 30 | a package name, for class methods) as the first argument. |
| 31 | |
| 32 | =back |
| 33 | |
| 34 | We'll cover these points now in more depth. |
| 35 | |
| 36 | =head2 An Object is Simply a Reference |
| 37 | |
| 38 | Unlike say C++, Perl doesn't provide any special syntax for |
| 39 | constructors. A constructor is merely a subroutine that returns a |
| 40 | reference to something "blessed" into a class, generally the |
| 41 | class that the subroutine is defined in. Here is a typical |
| 42 | constructor: |
| 43 | |
| 44 | package Critter; |
| 45 | sub new { bless {} } |
| 46 | |
| 47 | The C<{}> constructs a reference to an anonymous hash containing no |
| 48 | key/value pairs. The bless() takes that reference, tells the object |
| 49 | it references that it's now a Critter, and returns the reference. |
| 50 | Returning it is a convenience: the referenced object itself knows that |
| 51 | it has been blessed, so the reference could equally be returned |
| 52 | in a separate step, like this: |
| 53 | |
| 54 | sub new { |
| 55 | my $self = {}; |
| 56 | bless $self; |
| 57 | return $self; |
| 58 | } |
| 59 | |
| 60 | In fact, you often see such a thing in more complicated constructors |
| 61 | that wish to call methods in the class as part of the construction: |
| 62 | |
| 63 | sub new { |
| 64 | my $self = {}; |
| 65 | bless $self; |
| 66 | $self->initialize(); |
| 67 | return $self; |
| 68 | } |
| 69 | |
| 70 | If you care about inheritance (and you should; see |
| 71 | L<perlmod/"Modules: Creation, Use, and Abuse">), |
| 72 | then you want to use the two-arg form of bless |
| 73 | so that your constructors may be inherited: |
| 74 | |
| 75 | sub new { |
| 76 | my $class = shift; |
| 77 | my $self = {}; |
| 78 | bless $self, $class; |
| 79 | $self->initialize(); |
| 80 | return $self; |
| 81 | } |
| 82 | |
| 83 | Or if you expect people to call not just C<CLASS-E<gt>new()> but also |
| 84 | C<$obj-E<gt>new()>, then use something like this. The initialize() |
| 85 | method used will be of whatever $class we blessed the |
| 86 | object into: |
| 87 | |
| 88 | sub new { |
| 89 | my $this = shift; |
| 90 | my $class = ref($this) || $this; |
| 91 | my $self = {}; |
| 92 | bless $self, $class; |
| 93 | $self->initialize(); |
| 94 | return $self; |
| 95 | } |
| 96 | |
| 97 | Within the class package, the methods will typically deal with the |
| 98 | reference as an ordinary reference. Outside the class package, |
| 99 | the reference is generally treated as an opaque value that may |
| 100 | be accessed only through the class's methods. |
| 101 | |
| 102 | A constructor may re-bless a referenced object currently belonging to |
| 103 | another class, but then the new class is responsible for all cleanup |
| 104 | later. The previous blessing is forgotten, as an object may belong |
| 105 | to only one class at a time. (Although of course it's free to |
| 106 | inherit methods from many classes.) |
| 107 | |
| 108 | A clarification: Perl objects are blessed. References are not. Objects |
| 109 | know which package they belong to. References do not. The bless() |
| 110 | function uses the reference to find the object. Consider |
| 111 | the following example: |
| 112 | |
| 113 | $a = {}; |
| 114 | $b = $a; |
| 115 | bless $a, 'BLAH'; |
| 116 | print "\$b is a ", ref($b), "\n"; |
| 117 | |
| 118 | This reports $b as being a BLAH, so obviously bless() |
| 119 | operated on the object and not on the reference. |
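The constructor patterns above can be put together in a small runnable sketch. The C<Puppy> subclass and the C<NAME> attribute are assumptions made up for this illustration; only the shape of new() comes from the text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Critter;

sub new {
    my $this  = shift;
    my $class = ref($this) || $this;   # accept a class name or an object
    my $self  = {};
    bless $self, $class;               # two-arg bless keeps new() inheritable
    $self->initialize(@_);
    return $self;
}

sub initialize {
    my ($self, %args) = @_;
    $self->{NAME} = $args{NAME} || 'unnamed';   # NAME is a hypothetical attribute
}

package Puppy;                 # hypothetical subclass that inherits new()
our @ISA = ('Critter');

package main;

my $rex   = Puppy->new(NAME => 'Rex');      # called as a class method
my $clone = $rex->new(NAME => 'Rex II');    # called as an instance method

print ref($rex), "\n";          # Puppy
print ref($clone), "\n";        # Puppy
print $clone->{NAME}, "\n";     # Rex II
```

Because new() blesses into C<$class> rather than into its own package, both objects end up in Puppy even though the constructor is defined in Critter.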
| 120 | |
| 121 | =head2 A Class is Simply a Package |
| 122 | |
| 123 | Unlike say C++, Perl doesn't provide any special syntax for class |
| 124 | definitions. You use a package as a class by putting method |
| 125 | definitions into the class. |
| 126 | |
| 127 | There is a special array within each package called @ISA which says |
| 128 | where else to look for a method if you can't find it in the current |
| 129 | package. This is how Perl implements inheritance. Each element of the |
| 130 | @ISA array is just the name of another package that happens to be a |
| 131 | class package. The classes are searched (depth first) for missing |
| 132 | methods in the order that they occur in @ISA. The classes accessible |
| 133 | through @ISA are known as base classes of the current class. |
| 134 | |
| 135 | If a missing method is found in one of the base classes, it is cached |
| 136 | in the current class for efficiency. Changing @ISA or defining new |
| 137 | subroutines invalidates the cache and causes Perl to do the lookup again. |
| 138 | |
| 139 | If a method isn't found, but an AUTOLOAD routine is found, then |
| 140 | that is called on behalf of the missing method. |
| 141 | |
| 142 | If neither a method nor an AUTOLOAD routine is found in @ISA, then one |
| 143 | last try is made for the method (or an AUTOLOAD routine) in a class |
| 144 | called UNIVERSAL. (Several commonly used methods are automatically |
| 145 | supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for |
| 146 | more details.) If that doesn't work, Perl finally gives up and |
| 147 | complains. |
| 148 | |
| 149 | Perl classes do only method inheritance. Data inheritance is left |
| 150 | up to the class itself. By and large, this is not a problem in Perl, |
| 151 | because most classes model the attributes of their object using |
| 152 | an anonymous hash, which serves as its own little namespace to be |
| 153 | carved up by the various classes that might want to do something |
| 154 | with the object. |
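As a rough illustration of method lookup through @ISA with an AUTOLOAD fallback, consider the sketch below. The C<Animal> and C<Dog> classes and their methods are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Animal;
sub speak { my $self = shift; return ref($self) . " makes a sound" }
sub legs  { return 4 }

package Dog;
our @ISA = ('Animal');          # Animal is a base class of Dog
sub speak { my $self = shift; return ref($self) . " barks" }

# Called on behalf of any method found neither in Dog nor in its base classes.
our $AUTOLOAD;
sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # do not intercept destruction
    return "Dog cannot '$name'";
}

package main;

my $d = bless {}, 'Dog';
print $d->speak, "\n";   # Dog barks        (found in Dog)
print $d->legs, "\n";    # 4                (inherited from Animal)
print $d->fly, "\n";     # Dog cannot 'fly' (handled by AUTOLOAD)
```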
| 155 | |
| 156 | =head2 A Method is Simply a Subroutine |
| 157 | |
| 158 | Unlike say C++, Perl doesn't provide any special syntax for method |
| 159 | definition. (It does provide a little syntax for method invocation |
| 160 | though. More on that later.) A method expects its first argument |
| 161 | to be the object or package it is being invoked on. There are just two |
| 162 | types of methods, which we'll call class and instance. |
| 163 | (Sometimes you'll hear these called static and virtual, in honor of |
| 164 | the two C++ method types they most closely resemble.) |
| 165 | |
| 166 | A class method expects a class name as the first argument. It |
| 167 | provides functionality for the class as a whole, not for any individual |
| 168 | object belonging to the class. Constructors are typically class |
| 169 | methods. Many class methods simply ignore their first argument, because |
| 170 | they already know what package they're in, and don't care what package |
| 171 | they were invoked via. (These aren't necessarily the same, because |
| 172 | class methods follow the inheritance tree just like ordinary instance |
| 173 | methods.) Another typical use for class methods is to look up an |
| 174 | object by name: |
| 175 | |
| 176 | sub find { |
| 177 | my ($class, $name) = @_; |
| 178 | $objtable{$name}; |
| 179 | } |
| 180 | |
| 181 | An instance method expects an object reference as its first argument. |
| 182 | Typically it shifts the first argument into a "self" or "this" variable, |
| 183 | and then uses that as an ordinary reference. |
| 184 | |
| 185 | sub display { |
| 186 | my $self = shift; |
| 187 | my @keys = @_ ? @_ : sort keys %$self; |
| 188 | foreach my $key (@keys) { |
| 189 | print "\t$key => $self->{$key}\n"; |
| 190 | } |
| 191 | } |
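The two kinds of method can be tried out together in a self-contained sketch. The constructor and the way C<%objtable> gets filled are assumptions added here so that find() has something to look up; the attribute values are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Critter;

my %objtable;    # name => object registry (how it is filled is an assumption)

sub new {                        # class method: builds and registers an object
    my ($class, %attr) = @_;
    my $self = { %attr };
    bless $self, $class;
    $objtable{ $attr{Name} } = $self;
    return $self;
}

sub find {                       # class method: ignores nothing but needs no object
    my ($class, $name) = @_;
    return $objtable{$name};
}

sub display {                    # instance method: first argument is the object
    my $self = shift;
    my @keys = @_ ? @_ : sort keys %$self;
    foreach my $key (@keys) {
        print "\t$key => $self->{$key}\n";
    }
}

package main;

Critter->new(Name => 'Fred', Height => 1.8, Weight => 80);
my $fred = Critter->find('Fred');
$fred->display('Height', 'Weight');
```

This prints the Height and Weight entries, one per line, each preceded by a tab.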
| 192 | |
| 193 | =head2 Method Invocation |
| 194 | |
| 195 | There are two ways to invoke a method, one of which you're already |
| 196 | familiar with, and the other of which will look familiar. Perl 4 |
| 197 | already had an "indirect object" syntax that you use when you say |
| 198 | |
| 199 | print STDERR "help!!!\n"; |
| 200 | |
| 201 | This same syntax can be used to call either class or instance methods. |
| 202 | We'll use the two methods defined above, the class method to look up |
| 203 | an object reference and the instance method to print out its attributes. |
| 204 | |
| 205 | $fred = find Critter "Fred"; |
| 206 | display $fred 'Height', 'Weight'; |
| 207 | |
| 208 | These could be combined into one statement by using a BLOCK in the |
| 209 | indirect object slot: |
| 210 | |
| 211 | display {find Critter "Fred"} 'Height', 'Weight'; |
| 212 | |
| 213 | For C++ fans, there's also a syntax using -E<gt> notation that does exactly |
| 214 | the same thing. The parentheses are required if there are any arguments. |
| 215 | |
| 216 | $fred = Critter->find("Fred"); |
| 217 | $fred->display('Height', 'Weight'); |
| 218 | |
| 219 | or in one statement, |
| 220 | |
| 221 | Critter->find("Fred")->display('Height', 'Weight'); |
| 222 | |
| 223 | There are times when one syntax is more readable, and times when the |
| 224 | other syntax is more readable. The indirect object syntax is less |
| 225 | cluttered, but it has the same ambiguity as ordinary list operators. |
| 226 | Indirect object method calls are parsed using the same rule as list |
| 227 | operators: "If it looks like a function, it is a function". (Presuming |
| 228 | for the moment that you think two words in a row can look like a |
| 229 | function name. C++ programmers seem to think so with some regularity, |
| 230 | especially when the first word is "new".) Thus, the parentheses of |
| 231 | |
| 232 | new Critter ('Barney', 1.5, 70) |
| 233 | |
| 234 | are assumed to surround ALL the arguments of the method call, regardless |
| 235 | of what comes after. Saying |
| 236 | |
| 237 | new Critter ('Bam' x 2), 1.4, 45 |
| 238 | |
| 239 | would be equivalent to |
| 240 | |
| 241 | Critter->new('Bam' x 2), 1.4, 45 |
| 242 | |
| 243 | which is unlikely to do what you want. |
| 244 | |
| 245 | There are times when you wish to specify which class's method to use. |
| 246 | In this case, you can call your method as an ordinary subroutine |
| 247 | call, being sure to pass the requisite first argument explicitly: |
| 248 | |
| 249 | $fred = MyCritter::find("Critter", "Fred"); |
| 250 | MyCritter::display($fred, 'Height', 'Weight'); |
| 251 | |
| 252 | Note however, that this does not do any inheritance. If you wish |
| 253 | merely to specify that Perl should I<START> looking for a method in a |
| 254 | particular package, use an ordinary method call, but qualify the method |
| 255 | name with the package like this: |
| 256 | |
| 257 | $fred = Critter->MyCritter::find("Fred"); |
| 258 | $fred->MyCritter::display('Height', 'Weight'); |
| 259 | |
| 260 | If you're trying to control where the method search begins I<and> you're |
| 261 | executing in the class itself, then you may use the SUPER pseudo class, |
| 262 | which says to start looking in your base class's @ISA list without having |
| 263 | to name it explicitly: |
| 264 | |
| 265 | $self->SUPER::display('Height', 'Weight'); |
| 266 | |
| 267 | Please note that the C<SUPER::> construct is meaningful I<only> within the |
| 268 | class. |
| 269 | |
| 270 | Sometimes you want to call a method when you don't know the method name |
| 271 | ahead of time. You can use the arrow form, replacing the method name |
| 272 | with a simple scalar variable containing the method name: |
| 273 | |
| 274 | $method = $fast ? "findfirst" : "findbest"; |
| 275 | $fred->$method(@args); |
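The qualified, SUPER, and variable-name forms of the arrow syntax can be compared side by side. This is only a sketch: the C<describe> method and the C<Name> attribute are hypothetical, chosen to keep the output short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Critter;
sub new {
    my $class = shift;
    my $self  = { @_ };
    return bless $self, $class;
}
sub describe { my $self = shift; return "critter $self->{Name}" }

package MyCritter;
our @ISA = ('Critter');
sub describe {
    my $self = shift;
    # SUPER starts the search in the base classes of the current package
    return "my " . $self->SUPER::describe();
}

package main;

my $fred = MyCritter->new(Name => 'Fred');

print $fred->describe, "\n";             # my critter Fred
print $fred->Critter::describe, "\n";    # critter Fred (search starts in Critter)

my $method = 'describe';                 # method name chosen at run time
print $fred->$method(), "\n";            # my critter Fred
```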
| 276 | |
| 277 | =head2 Default UNIVERSAL methods |
| 278 | |
| 279 | The C<UNIVERSAL> package automatically contains the following methods that |
| 280 | are inherited by all other classes: |
| 281 | |
| 282 | =over 4 |
| 283 | |
| 284 | =item isa(CLASS) |
| 285 | |
| 286 | C<isa> returns I<true> if its object is blessed into C<CLASS> or into a subclass of C<CLASS>. |
| 287 | |
| 288 | C<isa> is also exportable and can be called as a sub with two arguments. This |
| 289 | makes it possible to check what a reference points to. Example: |
| 290 | |
| 291 | use UNIVERSAL qw(isa); |
| 292 | |
| 293 | if(isa($ref, 'ARRAY')) { |
| 294 | ... |
| 295 | } |
| 296 | |
| 297 | =item can(METHOD) |
| 298 | |
| 299 | C<can> checks whether its object has a method called C<METHOD>. |
| 300 | If it does, a reference to the sub is returned; if it does not, |
| 301 | I<undef> is returned. |
| 302 | |
| 303 | =item VERSION( [NEED] ) |
| 304 | |
| 305 | C<VERSION> returns the version number of the class (package). If the |
| 306 | NEED argument is given then it will check that the current version (as |
| 307 | defined by the $VERSION variable in the given package) is not less than |
| 308 | NEED; it will die if this is not the case. This method is normally |
| 309 | called as a class method. This method is called automatically by the |
| 310 | C<VERSION> form of C<use>. |
| 311 | |
| 312 | use A 1.2 qw(some imported subs); |
| 313 | # implies: |
| 314 | A->VERSION(1.2); |
| 315 | |
| 316 | =item class() |
| 317 | |
| 318 | C<class> returns the class name of its object. |
| 319 | |
| 320 | =item is_instance() |
| 321 | |
| 322 | C<is_instance> returns true if its object is an instance of some |
| 323 | class, false if its object is the class (package) itself. Example: |
| 324 | |
| 325 | A->is_instance(); # False |
| 326 | |
| 327 | $var = 'A'; |
| 328 | $var->is_instance(); # False |
| 329 | |
| 330 | $ref = bless [], 'A'; |
| 331 | $ref->is_instance(); # True |
| 332 | |
| 333 | =back |
| 334 | |
| 335 | B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and |
| 336 | C<isa> uses a very similar method and caching strategy. This may cause |
| 337 | strange effects if the Perl code dynamically changes @ISA in any package. |
| 338 | |
| 339 | You may add other methods to the UNIVERSAL class via Perl or XS code. |
| 340 | You do not need to C<use UNIVERSAL> in order to make these methods |
| 341 | available to your program. This is necessary only if you wish to |
| 342 | have C<isa> available as a plain subroutine in the current package. |
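A short sketch of the method forms of C<isa> and C<can> follows. The classes are hypothetical, and the example restricts itself to C<isa> and C<can>, called as methods without C<use UNIVERSAL>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Critter;
sub new   { my $class = shift; return bless {}, $class }
sub speak { return "..." }

package Puppy;                 # hypothetical subclass
our @ISA = ('Critter');

package main;

my $p = Puppy->new;

print "is a Critter\n"    if     $p->isa('Critter');   # true via @ISA
print "is not a Kitten\n" unless $p->isa('Kitten');

# can() returns a code reference, which can be called as a method.
if (my $code = $p->can('speak')) {
    print "speak returns: ", $p->$code(), "\n";
}
print "cannot fly\n" unless $p->can('fly');
```

Testing with can() before calling an optional method avoids the fatal error that an unknown method call would otherwise raise.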
| 343 | |
| 344 | =head2 Destructors |
| 345 | |
| 346 | When the last reference to an object goes away, the object is |
| 347 | automatically destroyed. (This may even be after you exit, if you've |
| 348 | stored references in global variables.) If you want to capture control |
| 349 | just before the object is freed, you may define a DESTROY method in |
| 350 | your class. It will automatically be called at the appropriate moment, |
| 351 | and you can do any extra cleanup you need to do. |
| 352 | |
| 353 | Perl doesn't do nested destruction for you. If your constructor |
| 354 | re-blessed a reference from one of your base classes, your DESTROY may |
| 355 | need to call DESTROY for any base classes that need it. But this applies |
| 356 | only to re-blessed objects--an object reference that is merely |
| 357 | I<CONTAINED> in the current object will be freed and destroyed |
| 358 | automatically when the current object is freed. |
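The behaviour of contained objects can be observed with a small sketch. The C<Car> and C<Engine> classes and the C<@log> array are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @log;    # records destruction order (illustration only)

package Engine;
sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { push @log, 'Engine destroyed' }

package Car;
sub new {
    my $class = shift;
    my $self  = { ENGINE => Engine->new };   # Engine is merely contained
    return bless $self, $class;
}
sub DESTROY { push @log, 'Car destroyed' }

package main;

{
    my $car = Car->new;
}   # the last reference to the Car goes away here

print "$_\n" for @log;
# Car destroyed
# Engine destroyed
```

Car's DESTROY never mentions the Engine; the contained object is freed, and its own DESTROY called, once the Car's hash is released.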
| 359 | |
| 360 | =head2 WARNING |
| 361 | |
| 362 | An indirect object is limited to a name, a scalar variable, or a block, |
| 363 | because it would have to do too much lookahead otherwise, just like any |
| 364 | other postfix dereference in the language. The left side of -E<gt> is not so |
| 365 | limited, because it's an infix operator, not a postfix operator. |
| 366 | |
| 367 | That means that below, A and B are equivalent to each other, and C and D |
| 368 | are equivalent, but AB and CD are different: |
| 369 | |
| 370 | A: method $obref->{"fieldname"} |
| 371 | B: (method $obref)->{"fieldname"} |
| 372 | C: $obref->{"fieldname"}->method() |
| 373 | D: method {$obref->{"fieldname"}} |
| 374 | |
| 375 | =head2 Summary |
| 376 | |
| 377 | That's about all there is to it. Now you need just to go off and buy a |
| 378 | book about object-oriented design methodology, and bang your forehead |
| 379 | with it for the next six months or so. |
| 380 | |
| 381 | =head2 Two-Phased Garbage Collection |
| 382 | |
| 383 | For most purposes, Perl uses a fast and simple reference-based |
| 384 | garbage collection system. For this reason, there's an extra |
| 385 | dereference going on at some level, so if you haven't built |
| 386 | your Perl executable using your C compiler's C<-O> flag, performance |
| 387 | will suffer. If you I<have> built Perl with C<cc -O>, then this |
| 388 | probably won't matter. |
| 389 | |
| 390 | A more serious concern is that unreachable memory with a non-zero |
| 391 | reference count will not normally get freed. Therefore, this is a bad |
| 392 | idea: |
| 393 | |
| 394 | { |
| 395 | my $a; |
| 396 | $a = \$a; |
| 397 | } |
| 398 | |
| 399 | Even though $a I<should> go away, it can't. When building recursive data |
| 400 | structures, you'll have to break the self-reference yourself explicitly |
| 401 | if you don't care to leak. For example, here's a self-referential |
| 402 | node such as one might use in a sophisticated tree structure: |
| 403 | |
| 404 | sub new_node { |
| 405 | my $self = shift; |
| 406 | my $class = ref($self) || $self; |
| 407 | my $node = {}; |
| 408 | $node->{LEFT} = $node->{RIGHT} = $node; |
| 409 | $node->{DATA} = [ @_ ]; |
| 410 | return bless $node => $class; |
| 411 | } |
| 412 | |
| 413 | If you create nodes like that, they (currently) won't go away unless you |
| 414 | break their self-reference yourself. (In other words, this is not to be |
| 415 | construed as a feature, and you shouldn't depend on it.) |
| 416 | |
| 417 | Almost. |
| 418 | |
| 419 | When an interpreter thread finally shuts down (usually when your program |
| 420 | exits), then a rather costly but complete mark-and-sweep style of garbage |
| 421 | collection is performed, and everything allocated by that thread gets |
| 422 | destroyed. This is essential to support Perl as an embedded or a |
| 423 | multi-threadable language. For example, this program demonstrates Perl's |
| 424 | two-phased garbage collection: |
| 425 | |
| 426 | #!/usr/bin/perl |
| 427 | package Subtle; |
| 428 | |
| 429 | sub new { |
| 430 | my $test; |
| 431 | $test = \$test; |
| 432 | warn "CREATING " . \$test; |
| 433 | return bless \$test; |
| 434 | } |
| 435 | |
| 436 | sub DESTROY { |
| 437 | my $self = shift; |
| 438 | warn "DESTROYING $self"; |
| 439 | } |
| 440 | |
| 441 | package main; |
| 442 | |
| 443 | warn "starting program"; |
| 444 | { |
| 445 | my $a = Subtle->new; |
| 446 | my $b = Subtle->new; |
| 447 | $$a = 0; # break selfref |
| 448 | warn "leaving block"; |
| 449 | } |
| 450 | |
| 451 | warn "just exited block"; |
| 452 | warn "time to die..."; |
| 453 | exit; |
| 454 | |
| 455 | When run as F</tmp/test>, the following output is produced: |
| 456 | |
| 457 | starting program at /tmp/test line 18. |
| 458 | CREATING SCALAR(0x8e5b8) at /tmp/test line 7. |
| 459 | CREATING SCALAR(0x8e57c) at /tmp/test line 7. |
| 460 | leaving block at /tmp/test line 23. |
| 461 | DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13. |
| 462 | just exited block at /tmp/test line 26. |
| 463 | time to die... at /tmp/test line 27. |
| 464 | DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. |
| 465 | |
| 466 | Notice that "global destruction" bit there? That's the thread |
| 467 | garbage collector reaching the unreachable. |
| 468 | |
| 469 | Objects are always destructed, even when regular refs aren't, and in fact |
| 470 | are destructed in a separate pass before ordinary refs just to try to |
| 471 | prevent object destructors from using refs that have themselves been |
| 472 | destructed. Plain refs are only garbage-collected if the destruct level |
| 473 | is greater than 0. You can test the higher levels of global destruction |
| 474 | by setting the PERL_DESTRUCT_LEVEL environment variable, presuming |
| 475 | C<-DDEBUGGING> was enabled during perl build time. |
| 476 | |
| 477 | A more complete garbage collection strategy will be implemented |
| 478 | at a future date. |
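Until then, a self-referential node like the one built by new_node() can be released early by breaking the cycle by hand. The sketch below assumes a hypothetical C<unlink_self> helper and a C<$destroyed> counter that exists only to make the effect visible.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Node;

my $destroyed = 0;    # counts DESTROY calls (illustration only)

sub new_node {
    my $self  = shift;
    my $class = ref($self) || $self;
    my $node  = {};
    $node->{LEFT} = $node->{RIGHT} = $node;   # the node refers to itself
    $node->{DATA} = [ @_ ];
    return bless $node => $class;
}

# Hypothetical helper: drop the self-references so the count can reach zero.
sub unlink_self {
    my $node = shift;
    delete @{$node}{qw(LEFT RIGHT)};
}

sub DESTROY { $destroyed++ }

package main;

{
    my $leaky = Node->new_node(1);
}   # still referenced by its own LEFT and RIGHT
print "after leaky block: $destroyed\n";   # after leaky block: 0

{
    my $clean = Node->new_node(2);
    $clean->unlink_self;
}   # cycle broken, so the node is destroyed here
print "after clean block: $destroyed\n";   # after clean block: 1
```

The leaky node is still reclaimed, but only during global destruction when the program exits.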
| 479 | |
| 480 | =head1 SEE ALSO |
| 481 | |
| 482 | A kinder, gentler tutorial on object-oriented programming in Perl can |
| 483 | be found in L<perltoot>. |
| 484 | You should also check out L<perlbot> for other object tricks, traps, and tips, |
| 485 | as well as L<perlmod> for some style guides on constructing both modules |
| 486 | and classes. |