Commit | Line | Data |
---|---|---|
a0d0e21e LW |
1 | =head1 NAME |
2 | ||
3 | perlobj - Perl objects | |
4 | ||
5 | =head1 DESCRIPTION | |
6 | ||
5f05dabc | 7 | First of all, you need to understand what references are in Perl. |
8 | See L<perlref> for that. Second, if you still find the following | |
9 | reference work too complicated, a tutorial on object-oriented programming | |
10 | in Perl can be found in L<perltoot>. | |
a0d0e21e | 11 | |
5f05dabc | 12 | If you're still with us, then |
13 | here are three very simple definitions that you should find reassuring. | |
a0d0e21e LW |
14 | |
15 | =over 4 | |
16 | ||
17 | =item 1. | |
18 | ||
19 | An object is simply a reference that happens to know which class it | |
20 | belongs to. | |
21 | ||
22 | =item 2. | |
23 | ||
24 | A class is simply a package that happens to provide methods to deal | |
25 | with object references. | |
26 | ||
27 | =item 3. | |
28 | ||
29 | A method is simply a subroutine that expects an object reference (or | |
55497cff | 30 | a package name, for class methods) as the first argument. |
a0d0e21e LW |
31 | |
32 | =back | |
33 | ||
34 | We'll cover these points now in more depth. | |
35 | ||
36 | =head2 An Object is Simply a Reference | |
37 | ||
38 | Unlike say C++, Perl doesn't provide any special syntax for | |
39 | constructors. A constructor is merely a subroutine that returns a | |
cb1a09d0 | 40 | reference to something "blessed" into a class, generally the |
a0d0e21e LW |
41 | class that the subroutine is defined in. Here is a typical |
42 | constructor: | |
43 | ||
44 | package Critter; | |
45 | sub new { bless {} } | |
46 | ||
47 | The C<{}> constructs a reference to an anonymous hash containing no | |
48 | key/value pairs. The bless() takes that reference and tells the object | |
49 | it references that it's now a Critter, and returns the reference. | |
5f05dabc | 50 | This is for convenience, because the referenced object itself knows that |
a0d0e21e LW |
51 | it has been blessed, and the reference to it could have been returned |
52 | directly, like this: | |
53 | ||
54 | sub new { | |
55 | my $self = {}; | |
56 | bless $self; | |
57 | return $self; | |
58 | } | |
59 | ||
60 | In fact, you often see such a thing in more complicated constructors | |
61 | that wish to call methods in the class as part of the construction: | |
62 | ||
63 | sub new { | |
64 | my $self = {}; | |
65 | bless $self; | |
66 | $self->initialize(); | |
cb1a09d0 AD |
67 | return $self; |
68 | } | |
69 | ||
1fef88e7 | 70 | If you care about inheritance (and you should; see |
5f05dabc | 71 | L<perlmod/"Modules: Creation, Use, and Abuse">), |
1fef88e7 | 72 | then you want to use the two-arg form of bless |
cb1a09d0 AD |
73 | so that your constructors may be inherited: |
74 | ||
75 | sub new { | |
76 | my $class = shift; | |
77 | my $self = {}; | |
78 | bless $self, $class; | |
79 | $self->initialize(); | |
80 | return $self; | |
81 | } | |
82 | ||
d28ebecd | 83 | Or if you expect people to call not just C<CLASS-E<gt>new()> but also |
84 | C<$obj-E<gt>new()>, then use something like this. The initialize() | |
cb1a09d0 AD |
85 | method used will be of whatever $class we blessed the |
86 | object into: | |
87 | ||
88 | sub new { | |
89 | my $this = shift; | |
90 | my $class = ref($this) || $this; | |
91 | my $self = {}; | |
92 | bless $self, $class; | |
93 | $self->initialize(); | |
94 | return $self; | |
a0d0e21e LW |
95 | } |
96 | ||
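The difference between the one-argument and two-argument forms of bless() can be seen in a small sketch. The C<Dog> subclass, the C<new_fixed> constructor, and the C<kind> field are hypothetical names introduced only for this illustration:

```perl
use strict;
use warnings;

package Critter;

# One-argument bless: always blesses into the package this sub was compiled in.
sub new_fixed { return bless {} }

# Two-argument bless: honors the class (or object) the method was invoked on.
sub new {
    my $this  = shift;
    my $class = ref($this) || $this;
    my $self  = {};
    bless $self, $class;
    $self->initialize();
    return $self;
}

sub initialize { my $self = shift; $self->{kind} = 'critter'; return }

package Dog;                 # hypothetical subclass, for illustration only
our @ISA = ('Critter');

sub initialize { my $self = shift; $self->{kind} = 'dog'; return }

package main;

my $fixed = Dog->new_fixed;  # blessed into Critter, not Dog
my $dog   = Dog->new;        # blessed into Dog
my $pup   = $dog->new;       # invoking new() on an object also yields a Dog

print ref($fixed), "\n";                    # Critter
print ref($dog),   "\n";                    # Dog
print ref($pup), " ", $pup->{kind}, "\n";   # Dog dog
```

Because initialize() is called through the freshly blessed object, the subclass's version runs whenever the constructor is inherited.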
97 | Within the class package, the methods will typically deal with the | |
98 | reference as an ordinary reference. Outside the class package, | |
99 | the reference is generally treated as an opaque value that may | |
5f05dabc | 100 | be accessed only through the class's methods. |
a0d0e21e | 101 | |
748a9306 | 102 | A constructor may re-bless a referenced object currently belonging to |
a0d0e21e | 103 | another class, but then the new class is responsible for all cleanup |
5f05dabc | 104 | later. The previous blessing is forgotten, as an object may belong |
105 | to only one class at a time. (Although of course it's free to | |
a0d0e21e LW |
106 | inherit methods from many classes.) |
107 | ||
108 | A clarification: Perl objects are blessed. References are not. Objects | |
109 | know which package they belong to. References do not. The bless() | |
5f05dabc | 110 | function uses the reference to find the object. Consider |
a0d0e21e LW |
111 | the following example: |
112 | ||
113 | $a = {}; | |
114 | $b = $a; | |
115 | bless $a, BLAH; | |
116 | print "\$b is a ", ref($b), "\n"; | |
117 | ||
118 | This reports $b as being a BLAH, so obviously bless() | |
119 | operated on the object and not on the reference. | |
120 | ||
121 | =head2 A Class is Simply a Package | |
122 | ||
123 | Unlike say C++, Perl doesn't provide any special syntax for class | |
5f05dabc | 124 | definitions. You use a package as a class by putting method |
a0d0e21e LW |
125 | definitions into the class. |
126 | ||
127 | There is a special array within each package called @ISA which says | |
128 | where else to look for a method if you can't find it in the current | |
129 | package. This is how Perl implements inheritance. Each element of the | |
130 | @ISA array is just the name of another package that happens to be a | |
131 | class package. The classes are searched (depth first) for missing | |
132 | methods in the order that they occur in @ISA. The classes accessible | |
cb1a09d0 | 133 | through @ISA are known as base classes of the current class. |
a0d0e21e LW |
134 | |
135 | If a missing method is found in one of the base classes, it is cached | |
136 | in the current class for efficiency. Changing @ISA or defining new | |
137 | subroutines invalidates the cache and causes Perl to do the lookup again. | |
138 | ||
139 | If a method isn't found, but an AUTOLOAD routine is found, then | |
140 | that is called on behalf of the missing method. | |
141 | ||
142 | If neither a method nor an AUTOLOAD routine is found in @ISA, then one | |
143 | last try is made for the method (or an AUTOLOAD routine) in a class | |
a2bdc9a5 | 144 | called UNIVERSAL. (Several commonly used methods are automatically |
145 | supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for | |
146 | more details.) If that doesn't work, Perl finally gives up and | |
a0d0e21e LW |
147 | complains. |
148 | ||
5f05dabc | 149 | Perl classes do only method inheritance. Data inheritance is left |
a0d0e21e LW |
150 | up to the class itself. By and large, this is not a problem in Perl, |
151 | because most classes model the attributes of their object using | |
152 | an anonymous hash, which serves as its own little namespace to be | |
153 | carved up by the various classes that might want to do something | |
154 | with the object. | |
155 | ||
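The lookup order described above can be sketched as follows. The classes C<Animal>, C<Pet>, and C<Cat>, and all of their methods, are assumptions made up for this example:

```perl
use strict;
use warnings;

package Animal;                 # hypothetical base class
sub speak { return "generic noise" }
sub name  { return "animal" }

package Pet;                    # second hypothetical base class
sub name  { return "pet" }
sub owner { return "nobody" }

package Cat;
our @ISA = ('Animal', 'Pet');   # searched in this order, depth first
our $AUTOLOAD;

sub speak { return "meow" }     # found in Cat itself, so @ISA is not consulted

# Called only when no method is found in Cat or anywhere in its @ISA tree.
sub AUTOLOAD {
    my $self   = shift;
    my $method = $AUTOLOAD;
    $method =~ s/.*:://;              # strip the package qualifier
    return if $method eq 'DESTROY';   # do not autoload destructors
    return "no method '$method'";
}

package main;

my $speak = Cat->speak;   # "meow"
my $name  = Cat->name;    # "animal" (Animal precedes Pet in @ISA)
my $owner = Cat->owner;   # "nobody" (inherited from Pet)
my $purr  = Cat->purr;    # "no method 'purr'" (handled by AUTOLOAD)

print "$_\n" for $speak, $name, $owner, $purr;
```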
156 | =head2 A Method is Simply a Subroutine | |
157 | ||
158 | Unlike say C++, Perl doesn't provide any special syntax for method | |
159 | definition. (It does provide a little syntax for method invocation | |
160 | though. More on that later.) A method expects its first argument | |
161 | to be the object or package it is being invoked on. There are just two | |
55497cff | 162 | types of methods, which we'll call class and instance. |
163 | (Sometimes you'll hear these called static and virtual, in honor of | |
164 | the two C++ method types they most closely resemble.) | |
a0d0e21e | 165 | |
55497cff | 166 | A class method expects a class name as the first argument. It |
a0d0e21e | 167 | provides functionality for the class as a whole, not for any individual |
55497cff | 168 | object belonging to the class. Constructors are typically class |
5f05dabc | 169 | methods. Many class methods simply ignore their first argument, because |
a0d0e21e | 170 | they already know what package they're in, and don't care what package |
5f05dabc | 171 | they were invoked via. (These aren't necessarily the same, because |
55497cff | 172 | class methods follow the inheritance tree just like ordinary instance |
173 | methods.) Another typical use for class methods is to look up an | |
a0d0e21e LW |
174 | object by name: |
175 | ||
176 | sub find { | |
177 | my ($class, $name) = @_; | |
178 | $objtable{$name}; | |
179 | } | |
180 | ||
55497cff | 181 | An instance method expects an object reference as its first argument. |
a0d0e21e LW |
182 | Typically it shifts the first argument into a "self" or "this" variable, |
183 | and then uses that as an ordinary reference. | |
184 | ||
185 | sub display { | |
186 | my $self = shift; | |
187 | my @keys = @_ ? @_ : sort keys %$self; | |
188 | foreach $key (@keys) { | |
189 | print "\t$key => $self->{$key}\n"; | |
190 | } | |
191 | } | |
192 | ||
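As a minimal sketch, the two kinds of method can be put together in one runnable class. The constructor that fills C<%objtable> and the attribute names are assumptions for this example, and display() here returns its lines rather than printing them:

```perl
use strict;
use warnings;

package Critter;

my %objtable;    # name => object registry (assumed to be filled by new)

# Class method: the first argument is the class name.
sub new {
    my ($class, %attr) = @_;
    my $self = { %attr };
    bless $self, $class;
    $objtable{ $attr{Name} } = $self;
    return $self;
}

# Class method: looks up an object by name and ignores $class.
sub find {
    my ($class, $name) = @_;
    return $objtable{$name};
}

# Instance method: the first argument is the object reference.
sub display {
    my $self = shift;
    my @keys = @_ ? @_ : sort keys %$self;
    return map { "$_ => $self->{$_}" } @keys;
}

package main;

Critter->new(Name => 'Fred', Height => 1.5, Weight => 70);

my $fred  = Critter->find('Fred');
my @lines = $fred->display('Height', 'Weight');
print "\t$_\n" for @lines;    # Height => 1.5, then Weight => 70
```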
193 | =head2 Method Invocation | |
194 | ||
195 | There are two ways to invoke a method, one of which you're already | |
196 | familiar with, and the other of which will look familiar. Perl 4 | |
197 | already had an "indirect object" syntax that you use when you say | |
198 | ||
199 | print STDERR "help!!!\n"; | |
200 | ||
55497cff | 201 | This same syntax can be used to call either class or instance methods. |
202 | We'll use the two methods defined above, the class method to look up | |
203 | an object reference and the instance method to print out its attributes. | |
a0d0e21e LW |
204 | |
205 | $fred = find Critter "Fred"; | |
206 | display $fred 'Height', 'Weight'; | |
207 | ||
208 | These could be combined into one statement by using a BLOCK in the | |
209 | indirect object slot: | |
210 | ||
211 | display {find Critter "Fred"} 'Height', 'Weight'; | |
212 | ||
d28ebecd | 213 | For C++ fans, there's also a syntax using -E<gt> notation that does exactly |
a0d0e21e LW |
214 | the same thing. The parentheses are required if there are any arguments. |
215 | ||
216 | $fred = Critter->find("Fred"); | |
217 | $fred->display('Height', 'Weight'); | |
218 | ||
219 | or in one statement, | |
220 | ||
221 | Critter->find("Fred")->display('Height', 'Weight'); | |
222 | ||
223 | There are times when one syntax is more readable, and times when the | |
224 | other syntax is more readable. The indirect object syntax is less | |
225 | cluttered, but it has the same ambiguity as ordinary list operators. | |
226 | Indirect object method calls are parsed using the same rule as list | |
227 | operators: "If it looks like a function, it is a function". (Presuming | |
228 | for the moment that you think two words in a row can look like a | |
229 | function name. C++ programmers seem to think so with some regularity, | |
5f05dabc | 230 | especially when the first word is "new".) Thus, the parentheses of |
a0d0e21e LW |
231 | |
232 | new Critter ('Barney', 1.5, 70) | |
233 | ||
234 | are assumed to surround ALL the arguments of the method call, regardless | |
235 | of what comes after. Saying | |
236 | ||
237 | new Critter ('Bam' x 2), 1.4, 45 | |
238 | ||
239 | would be equivalent to | |
240 | ||
241 | Critter->new('Bam' x 2), 1.4, 45 | |
242 | ||
243 | which is unlikely to do what you want. | |
244 | ||
245 | There are times when you wish to specify which class's method to use. | |
246 | In this case, you can call your method as an ordinary subroutine | |
247 | call, being sure to pass the requisite first argument explicitly: | |
248 | ||
249 | $fred = MyCritter::find("Critter", "Fred"); | |
250 | MyCritter::display($fred, 'Height', 'Weight'); | |
251 | ||
5f05dabc | 252 | Note however, that this does not do any inheritance. If you wish |
253 | merely to specify that Perl should I<START> looking for a method in a | |
a0d0e21e LW |
254 | particular package, use an ordinary method call, but qualify the method |
255 | name with the package like this: | |
256 | ||
257 | $fred = Critter->MyCritter::find("Fred"); | |
258 | $fred->MyCritter::display('Height', 'Weight'); | |
259 | ||
cb1a09d0 | 260 | If you're trying to control where the method search begins I<and> you're |
5f05dabc | 261 | executing in the class itself, then you may use the SUPER pseudo class, |
cb1a09d0 | 262 | which says to start looking in your base class's @ISA list without having |
5f05dabc | 263 | to name it explicitly: |
cb1a09d0 AD |
264 | |
265 | $self->SUPER::display('Height', 'Weight'); | |
266 | ||
5f05dabc | 267 | Please note that the C<SUPER::> construct is meaningful I<only> within the |
cb1a09d0 AD |
268 | class. |
269 | ||
748a9306 LW |
270 | Sometimes you want to call a method when you don't know the method name |
271 | ahead of time. You can use the arrow form, replacing the method name | |
272 | with a simple scalar variable containing the method name: | |
273 | ||
274 | $method = $fast ? "findfirst" : "findbest"; | |
275 | $fred->$method(@args); | |
276 | ||
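The following sketch contrasts an ordinary method call, a package-qualified call, a C<SUPER::> call, and a call through a method name held in a scalar. The class bodies and the C<name> field are assumptions invented for the example:

```perl
use strict;
use warnings;

package Critter;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub display { my $self = shift; return "Critter $self->{name}" }

package MyCritter;            # hypothetical subclass
our @ISA = ('Critter');

sub display {
    my $self = shift;
    # SUPER:: starts the search in MyCritter's @ISA, skipping this override.
    return "My" . $self->SUPER::display();
}

package main;

my $fred = MyCritter->new("Fred");

my $normal    = $fred->display;            # "MyCritter Fred"
my $qualified = $fred->Critter::display;   # "Critter Fred"

my $method  = "display";
my $dynamic = $fred->$method();            # "MyCritter Fred"

print "$_\n" for $normal, $qualified, $dynamic;
```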
a2bdc9a5 | 277 | =head2 Default UNIVERSAL methods |
278 | ||
279 | The C<UNIVERSAL> package automatically contains the following methods that | |
280 | are inherited by all other classes: | |
281 | ||
282 | =over 4 | |
283 | ||
71be2cbc | 284 | =item isa(CLASS) |
a2bdc9a5 | 285 | |
286 | C<isa> returns I<true> if its object is blessed into C<CLASS> or a subclass of C<CLASS>. | |
287 | ||
288 | C<isa> is also exportable and can be called as a sub with two arguments. This | |
289 | makes it possible to check what a reference points to. Example: | |
290 | ||
291 | use UNIVERSAL qw(isa); | |
292 | ||
293 | if(isa($ref, 'ARRAY')) { | |
294 | ... | |
295 | } | |
296 | ||
71be2cbc | 297 | =item can(METHOD) |
a2bdc9a5 | 298 | |
299 | C<can> checks whether its object has a method called C<METHOD>. | |
300 | If it does, a reference to the sub is returned; if it does not, | |
301 | I<undef> is returned. | |
302 | ||
71be2cbc | 303 | =item VERSION( [NEED] ) |
760ac839 | 304 | |
71be2cbc | 305 | C<VERSION> returns the version number of the class (package). If the |
306 | NEED argument is given then it will check that the current version (as | |
307 | defined by the $VERSION variable in the given package) is not less than | |
308 | NEED; it will die if this is not the case. This method is normally | |
309 | called as a class method. This method is called automatically by the | |
310 | C<VERSION> form of C<use>. | |
a2bdc9a5 | 311 | |
a2bdc9a5 | 312 | use A 1.2 qw(some imported subs); |
71be2cbc | 313 | # implies: |
314 | A->VERSION(1.2); | |
a2bdc9a5 | 315 | |
71be2cbc | 316 | =item class() |
a2bdc9a5 | 317 | |
318 | C<class> returns the class name of its object. | |
319 | ||
71be2cbc | 320 | =item is_instance() |
a2bdc9a5 | 321 | |
322 | C<is_instance> returns true if its object is an instance of some | |
323 | class, false if its object is the class (package) itself. Example: | |
324 | ||
325 | A->is_instance(); # False | |
326 | ||
327 | $var = 'A'; | |
328 | $var->is_instance(); # False | |
329 | ||
330 | $ref = bless [], 'A'; | |
331 | $ref->is_instance(); # True | |
332 | ||
a2bdc9a5 | 333 | =back |
334 | ||
335 | B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and | |
336 | C<isa> uses a very similar method and cache-ing strategy. This may cause | |
337 | strange effects if the Perl code dynamically changes @ISA in any package. | |
338 | ||
339 | You may add other methods to the UNIVERSAL class via Perl or XS code. | |
71be2cbc | 340 | You do not need to C<use UNIVERSAL> in order to make these methods |
341 | available to your program. This is necessary only if you wish to | |
342 | have C<isa> available as a plain subroutine in the current package. | |
a2bdc9a5 | 343 | |
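Here is a brief sketch of C<isa>, C<can>, and C<VERSION> called as methods. The C<Critter> and C<Dog> classes, the C<display> method, and the version number are assumptions chosen for illustration:

```perl
use strict;
use warnings;

package Critter;
our $VERSION = '1.2';

sub new     { my $class = shift; return bless {}, $class }
sub display { return "displaying" }

package Dog;                  # hypothetical subclass
our @ISA = ('Critter');

package main;

my $dog = Dog->new;

my $is_critter = $dog->isa('Critter') ? 1 : 0;   # 1: Dog inherits from Critter
my $is_other   = $dog->isa('Plant')   ? 1 : 0;   # 0: unrelated class name

my $code   = $dog->can('display');   # code reference to Critter::display
my $none   = $dog->can('fly');       # false: no such method
my $result = $code ? $dog->$code() : undef;

my $version = Critter->VERSION;      # '1.2'

# Asking for a newer version than the class provides makes VERSION die.
my $ok = eval { Critter->VERSION(2.0); 1 };
my $too_old = $ok ? 0 : 1;

print "$is_critter $is_other $result $version $too_old\n";
```

Testing the return value of C<can> before calling through it avoids a fatal error when the method does not exist.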
344 | =head2 Destructors | |
a0d0e21e LW |
345 | |
346 | When the last reference to an object goes away, the object is | |
347 | automatically destroyed. (This may even be after you exit, if you've | |
348 | stored references in global variables.) If you want to capture control | |
349 | just before the object is freed, you may define a DESTROY method in | |
350 | your class. It will automatically be called at the appropriate moment, | |
351 | and you can do any extra cleanup you need to do. | |
352 | ||
353 | Perl doesn't do nested destruction for you. If your constructor | |
5f05dabc | 354 | re-blessed a reference from one of your base classes, your DESTROY may |
355 | need to call DESTROY for any base classes that need it. But this applies | |
356 | to only re-blessed objects--an object reference that is merely | |
a0d0e21e LW |
357 | I<CONTAINED> in the current object will be freed and destroyed |
358 | automatically when the current object is freed. | |
359 | ||
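Since Perl calls only the first DESTROY it finds, a subclass that defines its own destructor has to chain to its base class explicitly. A minimal sketch, using hypothetical C<Base> and C<Derived> classes and a C<@log> array that exists only to record the calls:

```perl
use strict;
use warnings;

my @log;    # records destructor calls for this demonstration

package Base;       # hypothetical base class
sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { push @log, "Base cleanup" }

package Derived;    # hypothetical subclass with its own destructor
our @ISA = ('Base');

sub DESTROY {
    my $self = shift;
    push @log, "Derived cleanup";
    $self->SUPER::DESTROY();    # Perl will not call Base::DESTROY for us
}

package main;

{
    my $obj = Derived->new;
}   # the last reference goes away here, so DESTROY runs

print "$_\n" for @log;    # Derived cleanup, then Base cleanup
```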
748a9306 LW |
360 | =head2 WARNING |
361 | ||
362 | An indirect object is limited to a name, a scalar variable, or a block, | |
363 | because it would have to do too much lookahead otherwise, just like any | |
d28ebecd | 364 | other postfix dereference in the language. The left side of -E<gt> is not so |
748a9306 LW |
365 | limited, because it's an infix operator, not a postfix operator. |
366 | ||
367 | That means that below, A and B are equivalent to each other, and C and D | |
368 | are equivalent, but AB and CD are different: | |
369 | ||
370 | A: method $obref->{"fieldname"} | |
371 | B: (method $obref)->{"fieldname"} | |
372 | C: $obref->{"fieldname"}->method() | |
373 | D: method {$obref->{"fieldname"}} | |
374 | ||
a0d0e21e LW |
375 | =head2 Summary |
376 | ||
5f05dabc | 377 | That's about all there is to it. Now you need just to go off and buy a |
a0d0e21e LW |
378 | book about object-oriented design methodology, and bang your forehead |
379 | with it for the next six months or so. | |
380 | ||
cb1a09d0 AD |
381 | =head2 Two-Phased Garbage Collection |
382 | ||
383 | For most purposes, Perl uses a fast and simple reference-based | |
384 | garbage collection system. For this reason, there's an extra | |
385 | dereference going on at some level, so if you haven't built | |
386 | your Perl executable using your C compiler's C<-O> flag, performance | |
387 | will suffer. If you I<have> built Perl with C<cc -O>, then this | |
388 | probably won't matter. | |
389 | ||
390 | A more serious concern is that unreachable memory with a non-zero | |
391 | reference count will not normally get freed. Therefore, this is a bad | |
392 | idea: | |
393 | ||
394 | { | |
395 | my $a; | |
396 | $a = \$a; | |
397 | } | |
398 | ||
399 | Even though $a I<should> go away, it can't. When building recursive data | |
400 | structures, you'll have to break the self-reference yourself explicitly | |
401 | if you don't care to leak. For example, here's a self-referential | |
402 | node such as one might use in a sophisticated tree structure: | |
403 | ||
404 | sub new_node { | |
405 | my $self = shift; | |
406 | my $class = ref($self) || $self; | |
407 | my $node = {}; | |
408 | $node->{LEFT} = $node->{RIGHT} = $node; | |
409 | $node->{DATA} = [ @_ ]; | |
410 | return bless $node => $class; | |
411 | } | |
412 | ||
413 | If you create nodes like that, they (currently) won't go away unless you | |
414 | break their self-reference yourself. (In other words, this is not to be | |
415 | construed as a feature, and you shouldn't depend on it.) | |
416 | ||
417 | Almost. | |
418 | ||
419 | When an interpreter thread finally shuts down (usually when your program | |
420 | exits), then a rather costly but complete mark-and-sweep style of garbage | |
421 | collection is performed, and everything allocated by that thread gets | |
422 | destroyed. This is essential to support Perl as an embedded or a | |
5f05dabc | 423 | multi-threadable language. For example, this program demonstrates Perl's |
cb1a09d0 AD |
424 | two-phased garbage collection: |
425 | ||
426 | #!/usr/bin/perl | |
427 | package Subtle; | |
428 | ||
429 | sub new { | |
430 | my $test; | |
431 | $test = \$test; | |
432 | warn "CREATING " . \$test; | |
433 | return bless \$test; | |
434 | } | |
435 | ||
436 | sub DESTROY { | |
437 | my $self = shift; | |
438 | warn "DESTROYING $self"; | |
439 | } | |
440 | ||
441 | package main; | |
442 | ||
443 | warn "starting program"; | |
444 | { | |
445 | my $a = Subtle->new; | |
446 | my $b = Subtle->new; | |
447 | $$a = 0; # break selfref | |
448 | warn "leaving block"; | |
449 | } | |
450 | ||
451 | warn "just exited block"; | |
452 | warn "time to die..."; | |
453 | exit; | |
454 | ||
455 | When run as F</tmp/test>, the following output is produced: | |
456 | ||
457 | starting program at /tmp/test line 18. | |
458 | CREATING SCALAR(0x8e5b8) at /tmp/test line 7. | |
459 | CREATING SCALAR(0x8e57c) at /tmp/test line 7. | |
460 | leaving block at /tmp/test line 23. | |
461 | DESTROYING Subtle=SCALAR(0x8e5b8) at /tmp/test line 13. | |
462 | just exited block at /tmp/test line 26. | |
463 | time to die... at /tmp/test line 27. | |
464 | DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. | |
465 | ||
466 | Notice that "global destruction" bit there? That's the thread | |
467 | garbage collector reaching the unreachable. | |
468 | ||
469 | Objects are always destructed, even when regular refs aren't, and in fact | |
470 | are destructed in a separate pass before ordinary refs, just to try to | |
471 | prevent object destructors from using refs that have themselves been | |
5f05dabc | 472 | destructed. Plain refs are garbage-collected only if the destruct level |
cb1a09d0 AD |
473 | is greater than 0. You can test the higher levels of global destruction |
474 | by setting the PERL_DESTRUCT_LEVEL environment variable, presuming | |
475 | C<-DDEBUGGING> was enabled during perl build time. | |
476 | ||
477 | A more complete garbage collection strategy will be implemented | |
478 | at a future date. | |
479 | ||
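Returning to the self-referential node shown earlier in this section, one way to break the cycle by hand might look like the sketch below. The C<Node> package name, the C<unlink_node> method, and the C<$destroyed> counter are assumptions introduced only for this example:

```perl
use strict;
use warnings;

my $destroyed = 0;    # counts DESTROY calls for this demonstration

package Node;         # hypothetical package holding the node constructor

sub new_node {
    my $self  = shift;
    my $class = ref($self) || $self;
    my $node  = {};
    $node->{LEFT} = $node->{RIGHT} = $node;    # self-reference
    $node->{DATA} = [ @_ ];
    return bless $node => $class;
}

# Hypothetical helper: drop the links that point back at the node itself.
sub unlink_node {
    my $node = shift;
    delete @{$node}{qw(LEFT RIGHT)};
    return;
}

sub DESTROY { $destroyed++ }

package main;

{
    my $tidy = Node->new_node(1, 2, 3);
    $tidy->unlink_node;    # break the cycle before the variable goes away
}
my $after_tidy = $destroyed;     # 1: freed as soon as the block ended

{
    my $leaky = Node->new_node(4);
}
my $after_leaky = $destroyed;    # still 1: the cycle keeps the node alive

print "$after_tidy $after_leaky\n";
```

The second node is reclaimed only during global destruction, when the program exits.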
a0d0e21e LW |
480 | =head1 SEE ALSO |
481 | ||
5f05dabc | 482 | A kinder, gentler tutorial on object-oriented programming in Perl can |
483 | be found in L<perltoot>. | |
cb1a09d0 AD |
484 | You should also check out L<perlbot> for other object tricks, traps, and tips, |
485 | as well as L<perlmod> for some style guides on constructing both modules | |
486 | and classes. |