Commit | Line | Data |
---|---|---|
a0d0e21e | 1 | =head1 NAME |
d74e8afc | 2 | X<object> X<OOP> |
a0d0e21e LW |
3 | |
4 | perlobj - Perl objects | |
5 | ||
6 | =head1 DESCRIPTION | |
7 | ||
14218588 | 8 | First you need to understand what references are in Perl. |
5f05dabc | 9 | See L<perlref> for that. Second, if you still find the following |
10 | reference work too complicated, a tutorial on object-oriented programming | |
890a53b9 | 11 | in Perl can be found in L<perltoot> and L<perltooc>. |
a0d0e21e | 12 | |
54310121 | 13 | If you're still with us, then |
5f05dabc | 14 | here are three very simple definitions that you should find reassuring. |
a0d0e21e LW |
15 | |
16 | =over 4 | |
17 | ||
18 | =item 1. | |
19 | ||
20 | An object is simply a reference that happens to know which class it | |
21 | belongs to. | |
22 | ||
23 | =item 2. | |
24 | ||
25 | A class is simply a package that happens to provide methods to deal | |
26 | with object references. | |
27 | ||
28 | =item 3. | |
29 | ||
30 | A method is simply a subroutine that expects an object reference (or | |
55497cff | 31 | a package name, for class methods) as the first argument. |
a0d0e21e LW |
32 | |
33 | =back | |
34 | ||
35 | We'll cover these points now in more depth. | |
36 | ||
37 | =head2 An Object is Simply a Reference | |
d74e8afc | 38 | X<object> X<bless> X<constructor> X<new> |
a0d0e21e LW |
39 | |
40 | Unlike say C++, Perl doesn't provide any special syntax for | |
41 | constructors. A constructor is merely a subroutine that returns a | |
cb1a09d0 | 42 | reference to something "blessed" into a class, generally the |
a0d0e21e LW |
43 | class that the subroutine is defined in. Here is a typical |
44 | constructor: | |
45 | ||
46 | package Critter; | |
47 | sub new { bless {} } | |
48 | ||
5a964f20 TC |
49 | That word C<new> isn't special. You could have written |
50 | a constructor this way, too: | |
51 | ||
52 | package Critter; | |
53 | sub spawn { bless {} } | |
54 | ||
14218588 | 55 | This might even be preferable, because the C++ programmers won't |
5a964f20 TC |
56 | be tricked into thinking that C<new> works in Perl as it does in C++. |
57 | It doesn't. We recommend that you name your constructors whatever | |
58 | makes sense in the context of the problem you're solving. For example, | |
59 | constructors in the Tk extension to Perl are named after the widgets | |
60 | they create. | |
61 | ||
62 | One thing that's different about Perl constructors compared with those in | |
63 | C++ is that in Perl, they have to allocate their own memory. (The other | |
64 | thing is that they don't automatically call overridden base-class | |
65 | constructors.) The C<{}> allocates an anonymous hash containing no | |
66 | key/value pairs, and returns it. The bless() takes that reference and | |
67 | tells the object it references that it's now a Critter, and returns | |
68 | the reference. This is for convenience, because the referenced object | |
69 | itself knows that it has been blessed, and the reference to it could | |
70 | have been returned directly, like this: | |
a0d0e21e LW |
71 | |
72 | sub new { | |
73 | my $self = {}; | |
74 | bless $self; | |
75 | return $self; | |
76 | } | |
77 | ||
14218588 | 78 | You often see such a thing in more complicated constructors |
a0d0e21e LW |
79 | that wish to call methods in the class as part of the construction: |
80 | ||
81 | sub new { | |
5a964f20 | 82 | my $self = {}; |
a0d0e21e LW |
83 | bless $self; |
84 | $self->initialize(); | |
cb1a09d0 AD |
85 | return $self; |
86 | } | |
87 | ||
1fef88e7 | 88 | If you care about inheritance (and you should; see |
b687b08b | 89 | L<perlmodlib/"Modules: Creation, Use, and Abuse">), |
1fef88e7 | 90 | then you want to use the two-arg form of bless |
cb1a09d0 AD |
91 | so that your constructors may be inherited: |
92 | ||
93 | sub new { | |
94 | my $class = shift; | |
95 | my $self = {}; | |
5a964f20 | 96 | bless $self, $class; |
cb1a09d0 AD |
97 | $self->initialize(); |
98 | return $self; | |
99 | } | |
100 | ||
c47ff5f1 | 101 | Or if you expect people to call not just C<< CLASS->new() >> but also |
eac7fe86 CP |
102 | C<< $obj->new() >>, then use something like the following. (Note that using |
103 | this to call new() on an instance does not automatically perform any | |
104 | copying. If you want a shallow or deep copy of an object, you'll have to | |
105 | specifically allow for that.) The initialize() method used will be of | |
106 | whatever $class we blessed the object into: | |
cb1a09d0 AD |
107 | |
108 | sub new { | |
109 | my $this = shift; | |
110 | my $class = ref($this) || $this; | |
111 | my $self = {}; | |
5a964f20 | 112 | bless $self, $class; |
cb1a09d0 AD |
113 | $self->initialize(); |
114 | return $self; | |
a0d0e21e LW |
115 | } |
116 | ||
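To see why the two-argument form of bless matters for inheritance, here is a minimal sketch. The C<Critter::Dog> subclass and the C<sound> field are hypothetical names invented for this illustration; only the shape of new() comes from the text above.

```perl
use strict;
use warnings;

package Critter;

sub new {
    my $this  = shift;
    my $class = ref($this) || $this;    # accept a class name or an object
    my $self  = {};
    bless $self, $class;                # two-arg form: bless into the caller's class
    $self->initialize();                # dispatches on $class, not on Critter
    return $self;
}

sub initialize { $_[0]{sound} = 'generic noise' }

package Critter::Dog;
our @ISA = ('Critter');                 # inherits new() from Critter

sub initialize { $_[0]{sound} = 'woof' }

package main;

my $dog   = Critter::Dog->new;          # class-method call
my $puppy = $dog->new;                  # instance call: a fresh object, not a copy
print ref($dog), " says $dog->{sound}\n";
print ref($puppy), " is a separate object\n" if $puppy != $dog;
```

With the one-argument form, both objects would have been blessed into C<Critter>, because a bare C<bless $self> uses the package in which new() was compiled.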
117 | Within the class package, the methods will typically deal with the | |
118 | reference as an ordinary reference. Outside the class package, | |
119 | the reference is generally treated as an opaque value that may | |
5f05dabc | 120 | be accessed only through the class's methods. |
a0d0e21e | 121 | |
14218588 | 122 | Although a constructor can in theory re-bless a referenced object |
19799a22 GS |
123 | currently belonging to another class, this is almost certainly going |
124 | to get you into trouble. The new class is responsible for all | |
125 | cleanup later. The previous blessing is forgotten, as an object | |
126 | may belong to only one class at a time. (Although of course it's | |
127 | free to inherit methods from many classes.) If you find yourself | |
128 | having to do this, the parent class is probably misbehaving, though. | |
a0d0e21e LW |
129 | |
130 | A clarification: Perl objects are blessed. References are not. Objects | |
131 | know which package they belong to. References do not. The bless() | |
5f05dabc | 132 | function uses the reference to find the object. Consider |
a0d0e21e LW |
133 | the following example: |
134 | ||
135 | $a = {}; | |
136 | $b = $a; | |
137 | bless $a, "BLAH"; | |
138 | print "\$b is a ", ref($b), "\n"; | |
139 | ||
54310121 | 140 | This reports $b as being a BLAH, so obviously bless() |
a0d0e21e LW |
141 | operated on the object and not on the reference. |
142 | ||
143 | =head2 A Class is Simply a Package | |
d74e8afc | 144 | X<class> X<package> X<@ISA> X<inheritance> |
a0d0e21e LW |
145 | |
146 | Unlike say C++, Perl doesn't provide any special syntax for class | |
5f05dabc | 147 | definitions. You use a package as a class by putting method |
a0d0e21e LW |
148 | definitions into the class. |
149 | ||
5a964f20 | 150 | There is a special array within each package called @ISA, which says |
a0d0e21e LW |
151 | where else to look for a method if you can't find it in the current |
152 | package. This is how Perl implements inheritance. Each element of the | |
153 | @ISA array is just the name of another package that happens to be a | |
dd69841b BB |
154 | class package. The classes are searched for missing methods in |
155 | depth-first, left-to-right order by default (see L<mro> for alternative | |
156 | search order and other in-depth information). The classes accessible | |
54310121 | 157 | through @ISA are known as base classes of the current class. |
a0d0e21e | 158 | |
5a964f20 TC |
159 | All classes implicitly inherit from class C<UNIVERSAL> as their |
160 | last base class. Several commonly used methods are automatically | |
003db2bd RS |
161 | supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> or |
162 | L<UNIVERSAL|UNIVERSAL> for more details. | |
d74e8afc | 163 | X<UNIVERSAL> X<base class> X<class, base> |
5a964f20 | 164 | |
14218588 | 165 | If a missing method is found in a base class, it is cached |
a0d0e21e LW |
166 | in the current class for efficiency. Changing @ISA or defining new |
167 | subroutines invalidates the cache and causes Perl to do the lookup again. | |
168 | ||
5a964f20 TC |
169 | If neither the current class, its named base classes, nor the UNIVERSAL |
170 | class contains the requested method, these three places are searched | |
171 | all over again, this time looking for a method named AUTOLOAD(). If an | |
172 | AUTOLOAD is found, this method is called on behalf of the missing method, | |
173 | setting the package global $AUTOLOAD to be the fully qualified name of | |
174 | the method that was intended to be called. | |
d74e8afc | 175 | X<AUTOLOAD> |
5a964f20 TC |
176 | |
177 | If none of that works, Perl finally gives up and complains. | |
178 | ||
ed850460 | 179 | If you want to stop the AUTOLOAD inheritance, simply say |
d74e8afc | 180 | X<AUTOLOAD> |
ed850460 JH |
181 | |
182 | sub AUTOLOAD; | |
183 | ||
184 | and the call will die using the name of the sub being called. | |
185 | ||
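The fallback to AUTOLOAD described above can be sketched as follows. The class names and the fetch() call are hypothetical; the sketch only assumes that a base class defines AUTOLOAD and that the subclass defines no method of the requested name.

```perl
use strict;
use warnings;

package Critter;

our $AUTOLOAD;      # Perl sets this just before calling AUTOLOAD

sub new { my $class = shift; return bless {}, $class }

sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;               # fully qualified name of the missing method
    return if $name =~ /::DESTROY\z/;   # destruction is not a missing method
    return "AUTOLOAD handled $name";
}

package Critter::Dog;
our @ISA = ('Critter');                 # no fetch() defined here or in Critter

package main;

my $dog = Critter::Dog->new;
print $dog->fetch, "\n";                # falls through to Critter::AUTOLOAD
```

The check for DESTROY is there because object destruction goes through the same lookup, so an AUTOLOAD without it would be invoked every time an object is freed.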
5a964f20 TC |
186 | Perl classes do method inheritance only. Data inheritance is left up |
187 | to the class itself. By and large, this is not a problem in Perl, | |
188 | because most classes model the attributes of their object using an | |
189 | anonymous hash, which serves as its own little namespace to be carved up | |
190 | by the various classes that might want to do something with the object. | |
191 | The only problem with this is that you can't be sure that you aren't using | |
192 | a piece of the hash that is already used. A reasonable workaround | |
193 | is to prefix your field name in the hash with the package name. | |
d74e8afc | 194 | X<inheritance, method> X<inheritance, data> |
5a964f20 TC |
195 | |
196 | sub bump { | |
197 | my $self = shift; | |
198 | $self->{ __PACKAGE__ . ".count"}++; | |
199 | } | |
a0d0e21e LW |
200 | |
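A short sketch of the package-name prefix in use, with a base class and a subclass each keeping a private counter in the same hash. The C<Counter> and C<Counter::Loud> classes and the shout() method are hypothetical names for this illustration.

```perl
use strict;
use warnings;

package Counter;
sub new  { my $class = shift; return bless {}, $class }
sub bump {
    my $self = shift;
    return ++$self->{ __PACKAGE__ . ".count" };     # key is "Counter.count"
}

package Counter::Loud;
our @ISA = ('Counter');
sub shout {
    my $self = shift;
    return ++$self->{ __PACKAGE__ . ".count" };     # key is "Counter::Loud.count"
}

package main;

my $c = Counter::Loud->new;
$c->bump for 1 .. 3;
$c->shout;

# Both counters live in one hash without colliding.
print "$_ = $c->{$_}\n" for sort keys %$c;
```

Because C<__PACKAGE__> is resolved at compile time to the package the method was written in, each class gets its own key even though both use the field name "count".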
201 | =head2 A Method is Simply a Subroutine | |
d74e8afc | 202 | X<method> |
a0d0e21e LW |
203 | |
204 | Unlike say C++, Perl doesn't provide any special syntax for method | |
205 | definition. (It does provide a little syntax for method invocation | |
206 | though. More on that later.) A method expects its first argument | |
19799a22 GS |
207 | to be the object (reference) or package (string) it is being invoked |
208 | on. There are two ways of calling methods, which we'll call class | |
209 | methods and instance methods. | |
a0d0e21e | 210 | |
55497cff | 211 | A class method expects a class name as the first argument. It |
19799a22 GS |
212 | provides functionality for the class as a whole, not for any |
213 | individual object belonging to the class. Constructors are often | |
890a53b9 | 214 | class methods, but see L<perltoot> and L<perltooc> for alternatives. |
19799a22 GS |
215 | Many class methods simply ignore their first argument, because they |
216 | already know what package they're in and don't care what package | |
5f05dabc | 217 | they were invoked via. (These aren't necessarily the same, because |
55497cff | 218 | class methods follow the inheritance tree just like ordinary instance |
219 | methods.) Another typical use for class methods is to look up an | |
a0d0e21e LW |
220 | object by name: |
221 | ||
222 | sub find { | |
223 | my ($class, $name) = @_; | |
224 | $objtable{$name}; | |
225 | } | |
226 | ||
55497cff | 227 | An instance method expects an object reference as its first argument. |
a0d0e21e LW |
228 | Typically it shifts the first argument into a "self" or "this" variable, |
229 | and then uses that as an ordinary reference. | |
230 | ||
231 | sub display { | |
232 | my $self = shift; | |
233 | my @keys = @_ ? @_ : sort keys %$self; | |
234 | foreach my $key (@keys) { | |
235 | print "\t$key => $self->{$key}\n"; | |
236 | } | |
237 | } | |
238 | ||
239 | =head2 Method Invocation | |
d74e8afc | 240 | X<invocation> X<method> X<arrow> X<< -> >> |
a0d0e21e | 241 | |
5d9f8747 IK |
242 | For various historical and other reasons, Perl offers two equivalent |
243 | ways to write a method call. The simpler and more common way is to use | |
244 | the arrow notation: | |
a0d0e21e | 245 | |
5d9f8747 IK |
246 | my $fred = Critter->find("Fred"); |
247 | $fred->display("Height", "Weight"); | |
a0d0e21e | 248 | |
5f7b1de2 | 249 | You should already be familiar with the use of the C<< -> >> operator with |
5d9f8747 IK |
250 | references. In fact, since C<$fred> above is a reference to an object, |
251 | you could think of the method call as just another form of | |
252 | dereferencing. | |
a0d0e21e | 253 | |
5d9f8747 IK |
254 | Whatever is on the left side of the arrow, whether a reference or a |
255 | class name, is passed to the method subroutine as its first argument. | |
256 | So the above code is mostly equivalent to: | |
a0d0e21e | 257 | |
5d9f8747 IK |
258 | my $fred = Critter::find("Critter", "Fred"); |
259 | Critter::display($fred, "Height", "Weight"); | |
a0d0e21e | 260 | |
5d9f8747 IK |
261 | How does Perl know which package the subroutine is in? By looking at |
262 | the left side of the arrow, which must be either a package name or a | |
263 | reference to an object, i.e. something that has been blessed to a | |
5f7b1de2 | 264 | package. Either way, that's the package where Perl starts looking. If |
5d9f8747 IK |
265 | that package has no subroutine with that name, Perl starts looking for |
266 | it in any base classes of that package, and so on. | |
a0d0e21e | 267 | |
5f7b1de2 | 268 | If you need to, you I<can> force Perl to start looking in some other package: |
a0d0e21e | 269 | |
5d9f8747 IK |
270 | my $barney = MyCritter->Critter::find("Barney"); |
271 | $barney->Critter::display("Height", "Weight"); | |
a0d0e21e | 272 | |
5d9f8747 IK |
273 | Here C<MyCritter> is presumably a subclass of C<Critter> that defines |
274 | its own versions of find() and display(). We haven't specified what | |
275 | those methods do, but that doesn't matter above since we've forced Perl | |
276 | to start looking for the subroutines in C<Critter>. | |
a0d0e21e | 277 | |
5d9f8747 | 278 | As a special case of the above, you may use the C<SUPER> pseudo-class to |
5f7b1de2 JH |
279 | tell Perl to start looking for the method in the packages named in the |
280 | current class's C<@ISA> list. | |
d74e8afc | 281 | X<SUPER> |
a0d0e21e | 282 | |
5d9f8747 IK |
283 | package MyCritter; |
284 | use base 'Critter'; # sets @MyCritter::ISA = ('Critter'); | |
a0d0e21e | 285 | |
5d9f8747 IK |
286 | sub display { |
287 | my ($self, @args) = @_; | |
288 | $self->SUPER::display("Name", @args); | |
289 | } | |
a0d0e21e | 290 | |
50506ccd DM |
291 | It is important to note that C<SUPER> refers to the superclass(es) of the |
292 | I<current package> and not to the superclass(es) of the object. Also, the | |
029f3b44 DM |
293 | C<SUPER> pseudo-class can only currently be used as a modifier to a method |
294 | name, but not in any of the other ways that class names are normally used, | |
295 | e.g.: | |
d74e8afc | 296 | X<SUPER> |
029f3b44 DM |
297 | |
298 | something->SUPER::method(...); # OK | |
299 | SUPER::method(...); # WRONG | |
300 | SUPER->method(...); # WRONG | |
301 | ||
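The point that C<SUPER> is resolved relative to the current package, not the object's class, can be illustrated with a three-level hierarchy. The C<PetCritter> class and the name() method are hypothetical additions for this sketch.

```perl
use strict;
use warnings;

package Critter;
sub new  { my $class = shift; return bless {}, $class }
sub name { return 'Critter' }

package MyCritter;
our @ISA = ('Critter');
sub name {
    my $self = shift;
    # SUPER means "the base classes of MyCritter", whatever $self is.
    return 'MyCritter > ' . $self->SUPER::name();
}

package PetCritter;
our @ISA = ('MyCritter');       # does not override name()

package main;

my $pet = PetCritter->new;
print $pet->name, "\n";
```

If C<SUPER> were resolved from the object's class (C<PetCritter>), the call inside MyCritter::name would find MyCritter::name again and recurse forever; resolving it from the package where the call was compiled avoids that.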
5d9f8747 IK |
302 | Instead of a class name or an object reference, you can also use any |
303 | expression that returns either of those on the left side of the arrow. | |
304 | So the following statement is valid: | |
a0d0e21e | 305 | |
5d9f8747 | 306 | Critter->find("Fred")->display("Height", "Weight"); |
a0d0e21e | 307 | |
5f7b1de2 | 308 | and so is the following: |
cb1a09d0 | 309 | |
5d9f8747 | 310 | my $fred = (reverse "rettirC")->find(reverse "derF"); |
cb1a09d0 | 311 | |
d9693f5a SP |
312 | The right side of the arrow typically is the method name, but a simple |
313 | scalar variable containing either the method name or a subroutine | |
314 | reference can also be used. | |
315 | ||
e947c198 IG |
316 | If the right side of the arrow is a scalar containing a reference |
317 | to a subroutine, then this is equivalent to calling the referenced | |
318 | subroutine directly with the class name or object on the left side | |
319 | of the arrow as its first argument. No lookup is done and there is | |
320 | no requirement that the subroutine be defined in any package related | |
321 | to the class name or object on the left side of the arrow. | |
322 | ||
323 | For example, the following calls to $display are equivalent: | |
324 | ||
325 | my $display = sub { my $self = shift; ... }; | |
326 | $fred->$display("Height", "Weight"); | |
327 | $display->($fred, "Height", "Weight"); | |
328 | ||
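The other case mentioned above, a scalar holding a method I<name> rather than a subroutine reference, might look like this. The height() and weight() accessors are hypothetical; unlike the code-reference form, this one does go through the normal method lookup.

```perl
use strict;
use warnings;

package Critter;
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub height { return $_[0]{height} }
sub weight { return $_[0]{weight} }

package main;

my $fred = Critter->new(height => 30, weight => 12);

for my $method (qw(height weight)) {
    # The method named by $method is looked up at run time,
    # following the usual inheritance search.
    print "$method: ", $fred->$method(), "\n";
}
```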
5d9f8747 | 329 | =head2 Indirect Object Syntax |
d74e8afc | 330 | X<indirect object syntax> X<invocation, indirect> X<indirect> |
cb1a09d0 | 331 | |
5f7b1de2 JH |
332 | The other way to invoke a method is by using the so-called "indirect |
333 | object" notation. This syntax was available in Perl 4 long before | |
334 | objects were introduced, and is still used with filehandles like this: | |
748a9306 | 335 | |
5d9f8747 | 336 | print STDERR "help!!!\n"; |
19799a22 | 337 | |
5d9f8747 | 338 | The same syntax can be used to call either object or class methods. |
19799a22 | 339 | |
5d9f8747 IK |
340 | my $fred = find Critter "Fred"; |
341 | display $fred "Height", "Weight"; | |
19799a22 | 342 | |
5d9f8747 IK |
343 | Notice that there is no comma between the object or class name and the |
344 | parameters. This is how Perl can tell you want an indirect method call | |
345 | instead of an ordinary subroutine call. | |
19799a22 | 346 | |
5d9f8747 | 347 | But what if there are no arguments? In that case, Perl must guess what |
5f7b1de2 JH |
348 | you want. Even worse, it must make that guess I<at compile time>. |
349 | Usually Perl gets it right, but when it doesn't you get a function | |
350 | call compiled as a method, or vice versa. This can introduce subtle bugs | |
351 | that are hard to detect. | |
5d9f8747 | 352 | |
ac036724 | 353 | For example, a call to a method C<new> in indirect notation (as C++ |
354 | programmers are wont to make) can be miscompiled into a subroutine | |
5d9f8747 IK |
355 | call if there's already a C<new> function in scope. You'd end up |
356 | calling the current package's C<new> as a subroutine, rather than the | |
357 | desired class's method. The compiler tries to cheat by remembering | |
5f7b1de2 JH |
358 | bareword C<require>s, but the grief when it messes up just isn't worth the |
359 | years of debugging it will take you to track down such subtle bugs. | |
5d9f8747 IK |
360 | |
361 | There is another problem with this syntax: the indirect object is | |
362 | limited to a name, a scalar variable, or a block, because it would have | |
363 | to do too much lookahead otherwise, just like any other postfix | |
364 | dereference in the language. (These are the same quirky rules as are | |
365 | used for the filehandle slot in functions like C<print> and C<printf>.) | |
366 | This can lead to horribly confusing precedence problems, as in these | |
367 | next two lines: | |
19799a22 GS |
368 | |
369 | move $obj->{FIELD}; # probably wrong! | |
370 | move $ary[$i]; # probably wrong! | |
371 | ||
372 | Those actually parse as the very surprising: | |
373 | ||
374 | $obj->move->{FIELD}; # Well, lookee here | |
4f298f32 | 375 | $ary->move([$i]); # Didn't expect this one, eh? |
19799a22 GS |
376 | |
377 | Rather than what you might have expected: | |
378 | ||
379 | $obj->{FIELD}->move(); # You should be so lucky. | |
380 | $ary[$i]->move; # Yeah, sure. | |
381 | ||
5d9f8747 IK |
382 | To get the correct behavior with indirect object syntax, you would have |
383 | to use a block around the indirect object: | |
19799a22 | 384 | |
5d9f8747 IK |
385 | move {$obj->{FIELD}}; |
386 | move {$ary[$i]}; | |
387 | ||
388 | Even then, you still have the same potential problem if there happens to | |
389 | be a function named C<move> in the current package. B<The C<< -> >> | |
390 | notation suffers from neither of these disturbing ambiguities, so we | |
391 | recommend you use it exclusively.> However, you may still end up having | |
392 | to read code using the indirect object notation, so it's important to be | |
393 | familiar with it. | |
748a9306 | 394 | |
a2bdc9a5 | 395 | =head2 Default UNIVERSAL methods |
d74e8afc | 396 | X<UNIVERSAL> |
a2bdc9a5 | 397 | |
398 | The C<UNIVERSAL> package automatically contains the following methods that | |
399 | are inherited by all other classes: | |
400 | ||
401 | =over 4 | |
402 | ||
71be2cbc | 403 | =item isa(CLASS) |
d74e8afc | 404 | X<isa> |
a2bdc9a5 | 405 | |
68dc0745 | 406 | C<isa> returns I<true> if its object is blessed into C<CLASS> or a subclass of C<CLASS>. |
a2bdc9a5 | 407 | |
bcb8f0e8 | 408 | =item DOES(ROLE) |
003db2bd | 409 | X<DOES> |
bcb8f0e8 | 410 | |
003db2bd RS |
411 | C<DOES> returns I<true> if its object claims to perform the role C<ROLE>. By |
412 | default, this is equivalent to C<isa>. | |
bcb8f0e8 | 413 | |
71be2cbc | 414 | =item can(METHOD) |
d74e8afc | 415 | X<can> |
a2bdc9a5 | 416 | |
417 | C<can> checks whether its object has a method called C<METHOD>. | |
418 | If it does, a reference to the sub is returned; if it does not, | |
003db2bd | 419 | C<undef> is returned. |
b32b0a5d | 420 | |
71be2cbc | 421 | =item VERSION( [NEED] ) |
d74e8afc | 422 | X<VERSION> |
760ac839 | 423 | |
71be2cbc | 424 | C<VERSION> returns the version number of the class (package). If the |
425 | NEED argument is given then it will check that the current version (as | |
426 | defined by the $VERSION variable in the given package) is not less than | |
003db2bd RS |
427 | NEED; it will die if this is not the case. This method is called automatically |
428 | by the C<VERSION> form of C<use>. | |
a2bdc9a5 | 429 | |
003db2bd | 430 | use Package 1.2 qw(some imported subs); |
71be2cbc | 431 | # implies: |
003db2bd | 432 | Package->VERSION(1.2); |
a2bdc9a5 | 433 | |
a2bdc9a5 | 434 | =back |
435 | ||
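A brief sketch exercising the four methods listed above. The classes, the display() method and the version number are hypothetical; only the UNIVERSAL methods themselves come from Perl.

```perl
use strict;
use warnings;

package Critter;
our $VERSION = '1.02';
sub new     { my $class = shift; return bless {}, $class }
sub display { return 'displaying' }

package MyCritter;
our @ISA = ('Critter');

package main;

my $pet = MyCritter->new;

print "isa Critter\n"  if $pet->isa('Critter');
print "DOES Critter\n" if $pet->DOES('Critter');

# can() returns a code reference that can be called as a method.
if (my $code = $pet->can('display')) {
    print 'can display: ', $pet->$code(), "\n";
}
print "cannot fly\n" unless $pet->can('fly');

my $version = Critter->VERSION;
print "Critter version $version\n";

# VERSION(NEED) dies when $VERSION is lower than NEED.
my $ok = eval { Critter->VERSION(2.0); 1 };
print "version check failed\n" unless $ok;
```

Testing C<< $pet->can('display') >> before calling is generally more robust than checking the class with isa(), since it asks about the behaviour actually needed.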
54310121 | 436 | =head2 Destructors |
d74e8afc | 437 | X<destructor> X<DESTROY> |
a0d0e21e LW |
438 | |
439 | When the last reference to an object goes away, the object is | |
440 | automatically destroyed. (This may even be after you exit, if you've | |
441 | stored references in global variables.) If you want to capture control | |
442 | just before the object is freed, you may define a DESTROY method in | |
443 | your class. It will automatically be called at the appropriate moment, | |
4e8e7886 GS |
444 | and you can do any extra cleanup you need to do. Perl passes a reference |
445 | to the object under destruction as the first (and only) argument. Beware | |
446 | that the reference is a read-only value, and cannot be modified by | |
447 | manipulating C<$_[0]> within the destructor. The object itself (i.e. | |
448 | the thingy the reference points to, namely C<${$_[0]}>, C<@{$_[0]}>, | |
449 | C<%{$_[0]}> etc.) is not similarly constrained. | |
450 | ||
f4551fcd MG |
451 | Since DESTROY methods can be called at unpredictable times, it is |
452 | important that you localise any global variables that the method may | |
453 | update. In particular, localise C<$@> if you use C<eval {}> and | |
454 | localise C<$?> if you use C<system> or backticks. | |
455 | ||
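The effect of forgetting to localise C<$@> can be sketched like this. The C<Careless> and C<Careful> classes and the helper function are hypothetical; the sketch assumes only that a successful C<eval {}> resets C<$@> to the empty string.

```perl
use strict;
use warnings;

package Careless;
sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { eval { 1 } }              # a successful eval resets $@

package Careful;
sub new     { my $class = shift; return bless {}, $class }
sub DESTROY { local $@; eval { 1 } }    # $@ is restored when DESTROY returns

package main;

# Returns whatever is left in $@ after an object of $class is destroyed.
sub error_after_destroying {
    my $class = shift;
    my $obj   = $class->new;
    eval { die "boom\n" };      # sets $@ to "boom\n"
    undef $obj;                 # last reference gone, so DESTROY runs here
    return $@;
}

for my $class (qw(Careless Careful)) {
    my $err = error_after_destroying($class);
    chomp $err;
    print "$class left \$\@ as '$err'\n";
}
```

The caller of the careless class loses its error message simply because an unrelated object happened to be freed between the C<eval> and the check of C<$@>.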
4e8e7886 GS |
456 | If you arrange to re-bless the reference before the destructor returns, |
457 | perl will again call the DESTROY method for the re-blessed object after | |
458 | the current one returns. This can be used for clean delegation of | |
459 | object destruction, or for ensuring that destructors in the base classes | |
460 | of your choosing get called. Explicitly calling DESTROY is also possible, | |
461 | but is rarely needed. | |
462 | ||
a35666c6 NC |
463 | DESTROY is subject to AUTOLOAD lookup, just like any other method. Hence, if |
464 | your class has an AUTOLOAD method, but does not need any DESTROY actions, | |
465 | you probably want to provide a DESTROY method anyway, to prevent an | |
466 | expensive call to AUTOLOAD each time an object is freed. As this technique | |
467 | makes empty DESTROY methods common, the implementation is optimised so that | |
468 | a DESTROY method that is an empty or constant subroutine, and hence could | |
469 | have no side effects anyway, is not actually called. | |
470 | X<AUTOLOAD> X<DESTROY> | |
471 | ||
14218588 | 472 | Do not confuse the previous discussion with how objects I<CONTAINED> in the current |
4e8e7886 GS |
473 | one are destroyed. Such objects will be freed and destroyed automatically |
474 | when the current object is freed, provided no other references to them exist | |
475 | elsewhere. | |
a0d0e21e LW |
476 | |
477 | =head2 Summary | |
478 | ||
5f05dabc | 479 | That's about all there is to it. Now you need just to go off and buy a |
a0d0e21e LW |
480 | book about object-oriented design methodology, and bang your forehead |
481 | with it for the next six months or so. | |
482 | ||
cb1a09d0 | 483 | =head2 Two-Phased Garbage Collection |
d74e8afc ITB |
484 | X<garbage collection> X<GC> X<circular reference> |
485 | X<reference, circular> X<DESTROY> X<destructor> | |
cb1a09d0 | 486 | |
14218588 GS |
487 | For most purposes, Perl uses a fast and simple, reference-based |
488 | garbage collection system. That means there's an extra | |
cb1a09d0 AD |
489 | dereference going on at some level, so if you haven't built |
490 | your Perl executable using your C compiler's C<-O> flag, performance | |
491 | will suffer. If you I<have> built Perl with C<cc -O>, then this | |
492 | probably won't matter. | |
493 | ||
494 | A more serious concern is that unreachable memory with a non-zero | |
495 | reference count will not normally get freed. Therefore, this is a bad | |
54310121 | 496 | idea: |
cb1a09d0 AD |
497 | |
498 | { | |
499 | my $a; | |
500 | $a = \$a; | |
54310121 | 501 | } |
cb1a09d0 AD |
502 | |
503 | Even though $a I<should> go away, it can't. When building recursive data | |
504 | structures, you'll have to break the self-reference yourself explicitly | |
505 | if you don't care to leak. For example, here's a self-referential | |
506 | node such as one might use in a sophisticated tree structure: | |
507 | ||
508 | sub new_node { | |
eac7fe86 CP |
509 | my $class = shift; |
510 | my $node = {}; | |
cb1a09d0 AD |
511 | $node->{LEFT} = $node->{RIGHT} = $node; |
512 | $node->{DATA} = [ @_ ]; | |
513 | return bless $node => $class; | |
54310121 | 514 | } |
cb1a09d0 AD |
515 | |
516 | If you create nodes like that, they (currently) won't go away unless you | |
517 | break their self reference yourself. (In other words, this is not to be | |
518 | construed as a feature, and you shouldn't depend on it.) | |
519 | ||
520 | Almost. | |
521 | ||
522 | When an interpreter thread finally shuts down (usually when your program | |
523 | exits), then a rather costly but complete mark-and-sweep style of garbage | |
524 | collection is performed, and everything allocated by that thread gets | |
525 | destroyed. This is essential to support Perl as an embedded or a | |
54310121 | 526 | multithreadable language. For example, this program demonstrates Perl's |
cb1a09d0 AD |
527 | two-phased garbage collection: |
528 | ||
54310121 | 529 | #!/usr/bin/perl |
cb1a09d0 AD |
530 | package Subtle; |
531 | ||
532 | sub new { | |
533 | my $test; | |
534 | $test = \$test; | |
535 | warn "CREATING " . \$test; | |
536 | return bless \$test; | |
54310121 | 537 | } |
cb1a09d0 AD |
538 | |
539 | sub DESTROY { | |
540 | my $self = shift; | |
541 | warn "DESTROYING $self"; | |
54310121 | 542 | } |
cb1a09d0 AD |
543 | |
544 | package main; | |
545 | ||
546 | warn "starting program"; | |
547 | { | |
548 | my $a = Subtle->new; | |
549 | my $b = Subtle->new; | |
550 | $$a = 0; # break selfref | |
551 | warn "leaving block"; | |
54310121 | 552 | } |
cb1a09d0 AD |
553 | |
554 | warn "just exited block"; | |
555 | warn "time to die..."; | |
556 | exit; | |
557 | ||
2359510d SD |
558 | When run as F</foo/test>, the following output is produced: |
559 | ||
560 | starting program at /foo/test line 18. | |
561 | CREATING SCALAR(0x8e5b8) at /foo/test line 7. | |
562 | CREATING SCALAR(0x8e57c) at /foo/test line 7. | |
563 | leaving block at /foo/test line 23. | |
564 | DESTROYING Subtle=SCALAR(0x8e5b8) at /foo/test line 13. | |
565 | just exited block at /foo/test line 26. | |
566 | time to die... at /foo/test line 27. | |
cb1a09d0 AD |
567 | DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. |
568 | ||
569 | Notice that "global destruction" bit there? That's the thread | |
54310121 | 570 | garbage collector reaching the unreachable. |
cb1a09d0 | 571 | |
14218588 GS |
572 | Objects are always destructed, even when regular refs aren't. Objects |
573 | are destructed in a separate pass before ordinary refs just to | |
cb1a09d0 | 574 | prevent object destructors from using refs that have been themselves |
5f05dabc | 575 | destructed. Plain refs are only garbage-collected if the destruct level |
cb1a09d0 AD |
576 | is greater than 0. You can test the higher levels of global destruction |
577 | by setting the PERL_DESTRUCT_LEVEL environment variable, presuming | |
578 | C<-DDEBUGGING> was enabled during perl build time. | |
96090e4f | 579 | See L<perlhacktips/PERL_DESTRUCT_LEVEL> for more information. |
cb1a09d0 AD |
580 | |
581 | A more complete garbage collection strategy will be implemented | |
582 | at a future date. | |
583 | ||
5a964f20 TC |
584 | In the meantime, the best solution is to create a non-recursive container |
585 | class that holds a pointer to the self-referential data structure. | |
586 | Define a DESTROY method for the containing object's class that manually | |
587 | breaks the circularities in the self-referential structure. | |
588 | ||
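One possible shape for such a container is sketched below. The C<Node> and C<Tree> classes, the C<ROOT> field and the live-object counter are hypothetical; the node layout follows the self-referential example shown earlier.

```perl
use strict;
use warnings;

package Node;
my $live = 0;                   # number of Node objects not yet destroyed

sub new {
    my $class = shift;
    my $node  = bless { DATA => [@_] }, $class;
    $node->{LEFT} = $node->{RIGHT} = $node;     # self-reference: a cycle
    $live++;
    return $node;
}
sub DESTROY { $live-- }
sub live    { return $live }

package Tree;                   # non-recursive container around the cyclic data

sub new {
    my $class = shift;
    return bless { ROOT => Node->new(@_) }, $class;
}

sub DESTROY {
    my $self = shift;
    my $root = $self->{ROOT} or return;
    delete @{$root}{qw(LEFT RIGHT)};            # manually break the circularities
}

package main;

{
    my $tree = Tree->new('some', 'data');
    print 'inside block: ', Node->live, " live node(s)\n";
}
print 'after block:  ', Node->live, " live node(s)\n";
```

The container itself takes part in no cycle, so it is freed as soon as it goes out of scope, and its DESTROY method gets the chance to untangle the structure it owns. A real tree would need to walk every node rather than just the root.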
a0d0e21e LW |
589 | =head1 SEE ALSO |
590 | ||
8257a158 | 591 | A kinder, gentler tutorial on object-oriented programming in Perl can |
890a53b9 | 592 | be found in L<perltoot>, L<perlboot> and L<perltooc>. You should |
8257a158 MS |
593 | also check out L<perlbot> for other object tricks, traps, and tips, as |
594 | well as L<perlmodlib> for some style guides on constructing both | |
595 | modules and classes. |