Commit | Line | Data |
---|---|---|
a0d0e21e | 1 | =head1 NAME |
d74e8afc | 2 | X<object> X<OOP> |
a0d0e21e LW |
3 | |
4 | perlobj - Perl objects | |
5 | ||
6 | =head1 DESCRIPTION | |
7 | ||
14218588 | 8 | First you need to understand what references are in Perl. |
5f05dabc | 9 | See L<perlref> for that. Second, if you still find the following |
10 | reference work too complicated, a tutorial on object-oriented programming | |
890a53b9 | 11 | in Perl can be found in L<perltoot> and L<perltooc>. |
a0d0e21e | 12 | |
54310121 | 13 | If you're still with us, then |
5f05dabc | 14 | here are three very simple definitions that you should find reassuring. |
a0d0e21e LW |
15 | |
16 | =over 4 | |
17 | ||
18 | =item 1. | |
19 | ||
20 | An object is simply a reference that happens to know which class it | |
21 | belongs to. | |
22 | ||
23 | =item 2. | |
24 | ||
25 | A class is simply a package that happens to provide methods to deal | |
26 | with object references. | |
27 | ||
28 | =item 3. | |
29 | ||
30 | A method is simply a subroutine that expects an object reference (or | |
55497cff | 31 | a package name, for class methods) as the first argument. |
a0d0e21e LW |
32 | |
33 | =back | |
34 | ||
35 | We'll cover these points now in more depth. | |
36 | ||
37 | =head2 An Object is Simply a Reference | |
d74e8afc | 38 | X<object> X<bless> X<constructor> X<new> |
a0d0e21e LW |
39 | |
40 | Unlike say C++, Perl doesn't provide any special syntax for | |
41 | constructors. A constructor is merely a subroutine that returns a | |
cb1a09d0 | 42 | reference to something "blessed" into a class, generally the |
a0d0e21e LW |
43 | class that the subroutine is defined in. Here is a typical |
44 | constructor: | |
45 | ||
46 | package Critter; | |
47 | sub new { bless {} } | |
48 | ||
5a964f20 TC |
49 | That word C<new> isn't special. You could have written |
50 | a constructor this way, too: | |
51 | ||
52 | package Critter; | |
53 | sub spawn { bless {} } | |
54 | ||
14218588 | 55 | This might even be preferable, because the C++ programmers won't |
5a964f20 TC |
56 | be tricked into thinking that C<new> works in Perl as it does in C++. |
57 | It doesn't. We recommend that you name your constructors whatever | |
58 | makes sense in the context of the problem you're solving. For example, | |
59 | constructors in the Tk extension to Perl are named after the widgets | |
60 | they create. | |
61 | ||
62 | One thing that's different about Perl constructors compared with those in | |
63 | C++ is that in Perl, they have to allocate their own memory. (The other | |
64 | thing is that they don't automatically call overridden base-class | |
65 | constructors.) The C<{}> allocates an anonymous hash containing no | |
66 | key/value pairs, and returns it. The bless() takes that reference and | |
67 | tells the object it references that it's now a Critter, and returns | |
68 | the reference. This is for convenience, because the referenced object | |
69 | itself knows that it has been blessed, and the reference to it could | |
70 | have been returned directly, like this: | |
a0d0e21e LW |
71 | |
72 | sub new { | |
73 | my $self = {}; | |
74 | bless $self; | |
75 | return $self; | |
76 | } | |
77 | ||
14218588 | 78 | You often see such a thing in more complicated constructors |
a0d0e21e LW |
79 | that wish to call methods in the class as part of the construction: |
80 | ||
81 | sub new { | |
5a964f20 | 82 | my $self = {}; |
a0d0e21e LW |
83 | bless $self; |
84 | $self->initialize(); | |
cb1a09d0 AD |
85 | return $self; |
86 | } | |
87 | ||
1fef88e7 | 88 | If you care about inheritance (and you should; see |
b687b08b | 89 | L<perlmodlib/"Modules: Creation, Use, and Abuse">), |
1fef88e7 | 90 | then you want to use the two-arg form of bless |
cb1a09d0 AD |
91 | so that your constructors may be inherited: |
92 | ||
93 | sub new { | |
94 | my $class = shift; | |
95 | my $self = {}; | |
5a964f20 | 96 | bless $self, $class; |
cb1a09d0 AD |
97 | $self->initialize(); |
98 | return $self; | |
99 | } | |
100 | ||
c47ff5f1 | 101 | Or if you expect people to call not just C<< CLASS->new() >> but also |
eac7fe86 CP |
102 | C<< $obj->new() >>, then use something like the following. (Note that using |
103 | this to call new() on an instance does not automatically perform any | |
104 | copying. If you want a shallow or deep copy of an object, you'll have to | |
105 | specifically allow for that.) The initialize() method used will be of | |
106 | whatever $class we blessed the object into: | |
cb1a09d0 AD |
107 | |
108 | sub new { | |
109 | my $this = shift; | |
110 | my $class = ref($this) || $this; | |
111 | my $self = {}; | |
5a964f20 | 112 | bless $self, $class; |
cb1a09d0 AD |
113 | $self->initialize(); |
114 | return $self; | |
a0d0e21e LW |
115 | } |
116 | ||
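As a minimal sketch of the inheritable constructor described above, the following puts the C<ref($this) || $this> idiom together with a subclass. The C<Critter::Dog> package, the C<name> argument, and the body of initialize() are assumptions made for illustration; they do not come from any real module.

```perl
use strict;
use warnings;

package Critter;

# Constructor usable as Critter->new(...) or $critter->new(...).
sub new {
    my $this  = shift;
    my $class = ref($this) || $this;   # class name in either case
    my $self  = {};
    bless $self, $class;
    $self->initialize(@_);
    return $self;
}

# Hypothetical initializer: store a name, defaulting to "anonymous".
sub initialize {
    my ($self, %args) = @_;
    $self->{name} = defined $args{name} ? $args{name} : "anonymous";
    return;
}

package Critter::Dog;
our @ISA = ('Critter');    # inherits new() and initialize()

package main;

my $dog   = Critter::Dog->new(name => "Rex");
my $other = $dog->new;     # a fresh object, not a copy of $dog

print ref($dog), "\n";        # Critter::Dog
print ref($other), "\n";      # Critter::Dog
print $other->{name}, "\n";   # anonymous
```

Because the two-arg form of bless is used, the inherited constructor blesses into the class it was invoked on rather than into C<Critter>.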
117 | Within the class package, the methods will typically deal with the | |
118 | reference as an ordinary reference. Outside the class package, | |
119 | the reference is generally treated as an opaque value that may | |
5f05dabc | 120 | be accessed only through the class's methods. |
a0d0e21e | 121 | |
14218588 | 122 | Although a constructor can in theory re-bless a referenced object |
19799a22 GS |
123 | currently belonging to another class, this is almost certainly going |
124 | to get you into trouble. The new class is responsible for all | |
125 | cleanup later. The previous blessing is forgotten, as an object | |
126 | may belong to only one class at a time. (Although of course it's | |
127 | free to inherit methods from many classes.) If you find yourself | |
128 | having to do this, the parent class is probably misbehaving, though. | |
a0d0e21e LW |
129 | |
130 | A clarification: Perl objects are blessed. References are not. Objects | |
131 | know which package they belong to. References do not. The bless() | |
5f05dabc | 132 | function uses the reference to find the object. Consider |
a0d0e21e LW |
133 | the following example: |
134 | ||
135 | $a = {}; | |
136 | $b = $a; | |
137 | bless $a, "BLAH"; | |
138 | print "\$b is a ", ref($b), "\n"; | |
139 | ||
54310121 | 140 | This reports $b as being a BLAH, so obviously bless() |
a0d0e21e LW |
141 | operated on the object and not on the reference. |
142 | ||
143 | =head2 A Class is Simply a Package | |
d74e8afc | 144 | X<class> X<package> X<@ISA> X<inheritance> |
a0d0e21e LW |
145 | |
146 | Unlike say C++, Perl doesn't provide any special syntax for class | |
5f05dabc | 147 | definitions. You use a package as a class by putting method |
a0d0e21e LW |
148 | definitions into the class. |
149 | ||
5a964f20 | 150 | There is a special array within each package called @ISA, which says |
a0d0e21e LW |
151 | where else to look for a method if you can't find it in the current |
152 | package. This is how Perl implements inheritance. Each element of the | |
153 | @ISA array is just the name of another package that happens to be a | |
154 | class package. The classes are searched (depth first) for missing | |
155 | methods in the order that they occur in @ISA. The classes accessible | |
54310121 | 156 | through @ISA are known as base classes of the current class. |
a0d0e21e | 157 | |
5a964f20 TC |
158 | All classes implicitly inherit from class C<UNIVERSAL> as their |
159 | last base class. Several commonly used methods are automatically | |
160 | supplied in the UNIVERSAL class; see L<"Default UNIVERSAL methods"> for | |
161 | more details. | |
d74e8afc | 162 | X<UNIVERSAL> X<base class> X<class, base> |
5a964f20 | 163 | |
14218588 | 164 | If a missing method is found in a base class, it is cached |
a0d0e21e LW |
165 | in the current class for efficiency. Changing @ISA or defining new |
166 | subroutines invalidates the cache and causes Perl to do the lookup again. | |
167 | ||
5a964f20 TC |
168 | If neither the current class, its named base classes, nor the UNIVERSAL |
169 | class contains the requested method, these three places are searched | |
170 | all over again, this time looking for a method named AUTOLOAD(). If an | |
171 | AUTOLOAD is found, this method is called on behalf of the missing method, | |
172 | setting the package global $AUTOLOAD to be the fully qualified name of | |
173 | the method that was intended to be called. | |
d74e8afc | 174 | X<AUTOLOAD> |
5a964f20 TC |
175 | |
176 | If none of that works, Perl finally gives up and complains. | |
177 | ||
ed850460 | 178 | If you want to stop the AUTOLOAD inheritance, say simply |
d74e8afc | 179 | X<AUTOLOAD> |
ed850460 JH |
180 | |
181 | sub AUTOLOAD; | |
182 | ||
183 | and the call will die using the name of the sub being called. | |
184 | ||
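The lookup order described above (current class, then C<@ISA>, then AUTOLOAD) can be sketched as follows. The C<Animal> and C<Dog> packages and the methods speak() and fetch() are hypothetical names chosen only for this illustration.

```perl
use strict;
use warnings;

package Animal;
sub speak { return "generic noise" }

# Fallback for any method not found in the class or its base classes.
our $AUTOLOAD;
sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;              # fully qualified name of the missing method
    return if $name =~ /::DESTROY$/;   # don't intercept destructor lookups
    return "no such method: $name";
}

package Dog;
our @ISA = ('Animal');
sub new { my $class = shift; return bless {}, $class }

package main;

my $dog      = Dog->new;
my $found    = $dog->speak;   # found in Animal via @Dog::ISA
my $fallback = $dog->fetch;   # no fetch() anywhere, so Animal::AUTOLOAD runs
print "$found\n$fallback\n";
```

The check for C<DESTROY> is there because object destruction also goes through method lookup, so an AUTOLOAD that ignores it would be invoked whenever an object is freed.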
5a964f20 TC |
185 | Perl classes do method inheritance only. Data inheritance is left up |
186 | to the class itself. By and large, this is not a problem in Perl, | |
187 | because most classes model the attributes of their object using an | |
188 | anonymous hash, which serves as its own little namespace to be carved up | |
189 | by the various classes that might want to do something with the object. | |
190 | The only problem with this is that you can't be sure that you aren't using | |
191 | a piece of the hash that is already used. A reasonable workaround | |
192 | is to prefix your field name in the hash with the package name. | |
d74e8afc | 193 | X<inheritance, method> X<inheritance, data> |
5a964f20 TC |
194 | |
195 | sub bump { | |
196 | my $self = shift; | |
197 | $self->{ __PACKAGE__ . ".count"}++; | |
198 | } | |
a0d0e21e LW |
199 | |
200 | =head2 A Method is Simply a Subroutine | |
d74e8afc | 201 | X<method> |
a0d0e21e LW |
202 | |
203 | Unlike say C++, Perl doesn't provide any special syntax for method | |
204 | definition. (It does provide a little syntax for method invocation | |
205 | though. More on that later.) A method expects its first argument | |
19799a22 GS |
206 | to be the object (reference) or package (string) it is being invoked |
207 | on. There are two ways of calling methods, which we'll call class | |
208 | methods and instance methods. | |
a0d0e21e | 209 | |
55497cff | 210 | A class method expects a class name as the first argument. It |
19799a22 GS |
211 | provides functionality for the class as a whole, not for any |
212 | individual object belonging to the class. Constructors are often | |
890a53b9 | 213 | class methods, but see L<perltoot> and L<perltooc> for alternatives. |
19799a22 GS |
214 | Many class methods simply ignore their first argument, because they |
215 | already know what package they're in and don't care what package | |
5f05dabc | 216 | they were invoked via. (These aren't necessarily the same, because |
55497cff | 217 | class methods follow the inheritance tree just like ordinary instance |
218 | methods.) Another typical use for class methods is to look up an | |
a0d0e21e LW |
219 | object by name: |
220 | ||
221 | sub find { | |
222 | my ($class, $name) = @_; | |
223 | $objtable{$name}; | |
224 | } | |
225 | ||
55497cff | 226 | An instance method expects an object reference as its first argument. |
a0d0e21e LW |
227 | Typically it shifts the first argument into a "self" or "this" variable, |
228 | and then uses that as an ordinary reference. | |
229 | ||
230 | sub display { | |
231 | my $self = shift; | |
232 | my @keys = @_ ? @_ : sort keys %$self; | |
233 | foreach my $key (@keys) { | |
234 | print "\t$key => $self->{$key}\n"; | |
235 | } | |
236 | } | |
237 | ||
238 | =head2 Method Invocation | |
d74e8afc | 239 | X<invocation> X<method> X<arrow> X<< -> >> |
a0d0e21e | 240 | |
5d9f8747 IK |
241 | For various historical and other reasons, Perl offers two equivalent |
242 | ways to write a method call. The simpler and more common way is to use | |
243 | the arrow notation: | |
a0d0e21e | 244 | |
5d9f8747 IK |
245 | my $fred = Critter->find("Fred"); |
246 | $fred->display("Height", "Weight"); | |
a0d0e21e | 247 | |
5f7b1de2 | 248 | You should already be familiar with the use of the C<< -> >> operator with |
5d9f8747 IK |
249 | references. In fact, since C<$fred> above is a reference to an object, |
250 | you could think of the method call as just another form of | |
251 | dereferencing. | |
a0d0e21e | 252 | |
5d9f8747 IK |
253 | Whatever is on the left side of the arrow, whether a reference or a |
254 | class name, is passed to the method subroutine as its first argument. | |
255 | So the above code is mostly equivalent to: | |
a0d0e21e | 256 | |
5d9f8747 IK |
257 | my $fred = Critter::find("Critter", "Fred"); |
258 | Critter::display($fred, "Height", "Weight"); | |
a0d0e21e | 259 | |
5d9f8747 IK |
260 | How does Perl know which package the subroutine is in? By looking at |
261 | the left side of the arrow, which must be either a package name or a | |
262 | reference to an object, i.e. something that has been blessed to a | |
5f7b1de2 | 263 | package. Either way, that's the package where Perl starts looking. If |
5d9f8747 IK |
264 | that package has no subroutine with that name, Perl starts looking for |
265 | it in any base classes of that package, and so on. | |
a0d0e21e | 266 | |
5f7b1de2 | 267 | If you need to, you I<can> force Perl to start looking in some other package: |
a0d0e21e | 268 | |
5d9f8747 IK |
269 | my $barney = MyCritter->Critter::find("Barney"); |
270 | $barney->Critter::display("Height", "Weight"); | |
a0d0e21e | 271 | |
5d9f8747 IK |
272 | Here C<MyCritter> is presumably a subclass of C<Critter> that defines |
273 | its own versions of find() and display(). We haven't specified what | |
274 | those methods do, but that doesn't matter above since we've forced Perl | |
275 | to start looking for the subroutines in C<Critter>. | |
a0d0e21e | 276 | |
5d9f8747 | 277 | As a special case of the above, you may use the C<SUPER> pseudo-class to |
5f7b1de2 JH |
278 | tell Perl to start looking for the method in the packages named in the |
279 | current class's C<@ISA> list. | |
d74e8afc | 280 | X<SUPER> |
a0d0e21e | 281 | |
5d9f8747 IK |
282 | package MyCritter; |
283 | use base 'Critter'; # sets @MyCritter::ISA = ('Critter'); | |
a0d0e21e | 284 | |
5d9f8747 IK |
285 | sub display { |
286 | my ($self, @args) = @_; | |
287 | $self->SUPER::display("Name", @args); | |
288 | } | |
a0d0e21e | 289 | |
50506ccd DM |
290 | It is important to note that C<SUPER> refers to the superclass(es) of the |
291 | I<current package> and not to the superclass(es) of the object. Also, the | |
029f3b44 DM |
292 | C<SUPER> pseudo-class can only currently be used as a modifier to a method |
293 | name, but not in any of the other ways that class names are normally used, | |
294 | e.g.: | |
d74e8afc | 295 | X<SUPER> |
029f3b44 DM |
296 | |
297 | something->SUPER::method(...); # OK | |
298 | SUPER::method(...); # WRONG | |
299 | SUPER->method(...); # WRONG | |
300 | ||
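To illustrate that C<SUPER> is resolved relative to the package in which the method was compiled, here is a small sketch with three levels of inheritance. C<TinyCritter> and the describe() method are hypothetical additions for the example.

```perl
use strict;
use warnings;

package Critter;
sub new      { my $class = shift; return bless {}, $class }
sub describe { return "critter" }

package MyCritter;
our @ISA = ('Critter');
sub describe {
    my $self = shift;
    # SUPER starts the search in @MyCritter::ISA because this sub
    # was compiled in package MyCritter.
    return "my " . $self->SUPER::describe();
}

package TinyCritter;
our @ISA = ('MyCritter');   # defines no describe() of its own

package main;

my $tiny = TinyCritter->new;
my $text = $tiny->describe;   # MyCritter::describe, then Critter::describe
print "$text\n";              # my critter
```

If C<SUPER> were resolved relative to the object's class (C<TinyCritter>), the call inside C<MyCritter::describe> would find C<MyCritter::describe> again and recurse forever.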
5d9f8747 IK |
301 | Instead of a class name or an object reference, you can also use any |
302 | expression that returns either of those on the left side of the arrow. | |
303 | So the following statement is valid: | |
a0d0e21e | 304 | |
5d9f8747 | 305 | Critter->find("Fred")->display("Height", "Weight"); |
a0d0e21e | 306 | |
5f7b1de2 | 307 | and so is the following: |
cb1a09d0 | 308 | |
5d9f8747 | 309 | my $fred = (reverse "rettirC")->find(reverse "derF"); |
cb1a09d0 | 310 | |
d9693f5a SP |
311 | The right side of the arrow typically is the method name, but a simple |
312 | scalar variable containing either the method name or a subroutine | |
313 | reference can also be used. | |
314 | ||
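Both forms of a variable on the right side of the arrow can be sketched like this. The C<height> and C<weight> accessors and the constructor arguments are assumptions for the example.

```perl
use strict;
use warnings;

package Critter;
sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub height { return $_[0]{height} }
sub weight { return $_[0]{weight} }

package main;

my $fred = Critter->new(height => 12, weight => 30);

# Method name held in a scalar.
my $method = "height";
my $h = $fred->$method();

# Subroutine reference held in a scalar; $fred is still passed first.
my $double_weight = sub { my $self = shift; return 2 * $self->weight };
my $w = $fred->$double_weight();

print "$h $w\n";   # 12 60
```

When the scalar holds a subroutine reference, no method lookup takes place; the referenced sub is simply called with the invocant as its first argument.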
5d9f8747 | 315 | =head2 Indirect Object Syntax |
d74e8afc | 316 | X<indirect object syntax> X<invocation, indirect> X<indirect> |
cb1a09d0 | 317 | |
5f7b1de2 JH |
318 | The other way to invoke a method is by using the so-called "indirect |
319 | object" notation. This syntax was available in Perl 4 long before | |
320 | objects were introduced, and is still used with filehandles like this: | |
748a9306 | 321 | |
5d9f8747 | 322 | print STDERR "help!!!\n"; |
19799a22 | 323 | |
5d9f8747 | 324 | The same syntax can be used to call either object or class methods. |
19799a22 | 325 | |
5d9f8747 IK |
326 | my $fred = find Critter "Fred"; |
327 | display $fred "Height", "Weight"; | |
19799a22 | 328 | |
5d9f8747 IK |
329 | Notice that there is no comma between the object or class name and the |
330 | parameters. This is how Perl can tell you want an indirect method call | |
331 | instead of an ordinary subroutine call. | |
19799a22 | 332 | |
5d9f8747 | 333 | But what if there are no arguments? In that case, Perl must guess what |
5f7b1de2 JH |
334 | you want. Even worse, it must make that guess I<at compile time>. |
335 | Usually Perl gets it right, but when it doesn't you get a function | |
336 | call compiled as a method, or vice versa. This can introduce subtle bugs | |
337 | that are hard to detect. | |
5d9f8747 | 338 | |
5f7b1de2 JH |
339 | For example, a call to a method C<new> in indirect notation -- as C++ |
340 | programmers are wont to make -- can be miscompiled into a subroutine | |
5d9f8747 IK |
341 | call if there's already a C<new> function in scope. You'd end up |
342 | calling the current package's C<new> as a subroutine, rather than the | |
343 | desired class's method. The compiler tries to cheat by remembering | |
5f7b1de2 JH |
344 | bareword C<require>s, but the grief when it messes up just isn't worth the |
345 | years of debugging it will take you to track down such subtle bugs. | |
5d9f8747 IK |
346 | |
347 | There is another problem with this syntax: the indirect object is | |
348 | limited to a name, a scalar variable, or a block, because it would have | |
349 | to do too much lookahead otherwise, just like any other postfix | |
350 | dereference in the language. (These are the same quirky rules as are | |
351 | used for the filehandle slot in functions like C<print> and C<printf>.) | |
352 | This can lead to horribly confusing precedence problems, as in these | |
353 | next two lines: | |
19799a22 GS |
354 | |
355 | move $obj->{FIELD}; # probably wrong! | |
356 | move $ary[$i]; # probably wrong! | |
357 | ||
358 | Those actually parse as the very surprising: | |
359 | ||
360 | $obj->move->{FIELD}; # Well, lookee here | |
4f298f32 | 361 | $ary->move([$i]); # Didn't expect this one, eh? |
19799a22 GS |
362 | |
363 | Rather than what you might have expected: | |
364 | ||
365 | $obj->{FIELD}->move(); # You should be so lucky. | |
366 | $ary[$i]->move; # Yeah, sure. | |
367 | ||
5d9f8747 IK |
368 | To get the correct behavior with indirect object syntax, you would have |
369 | to use a block around the indirect object: | |
19799a22 | 370 | |
5d9f8747 IK |
371 | move {$obj->{FIELD}}; |
372 | move {$ary[$i]}; | |
373 | ||
374 | Even then, you still have the same potential problem if there happens to | |
375 | be a function named C<move> in the current package. B<The C<< -> >> | |
376 | notation suffers from neither of these disturbing ambiguities, so we | |
377 | recommend you use it exclusively.> However, you may still end up having | |
378 | to read code using the indirect object notation, so it's important to be | |
379 | familiar with it. | |
748a9306 | 380 | |
a2bdc9a5 | 381 | =head2 Default UNIVERSAL methods |
d74e8afc | 382 | X<UNIVERSAL> |
a2bdc9a5 | 383 | |
384 | The C<UNIVERSAL> package automatically contains the following methods that | |
385 | are inherited by all other classes: | |
386 | ||
387 | =over 4 | |
388 | ||
71be2cbc | 389 | =item isa(CLASS) |
d74e8afc | 390 | X<isa> |
a2bdc9a5 | 391 | |
68dc0745 | 392 | C<isa> returns I<true> if its object is blessed into C<CLASS> or a subclass of C<CLASS>. |
a2bdc9a5 | 393 | |
da279afe | 394 | You can also call C<UNIVERSAL::isa> as a subroutine with two arguments. Of |
395 | course, this will do the wrong thing if someone has overridden C<isa> in a | |
396 | class, so don't do it. | |
a2bdc9a5 | 397 | |
da279afe | 398 | If you need to determine whether you've received a valid invocant, use the |
399 | C<blessed> function from L<Scalar::Util>: | |
d74e8afc | 400 | X<invocant> X<blessed> |
a2bdc9a5 | 401 | |
da279afe | 402 | if (blessed($ref) && $ref->isa( 'Some::Class')) { |
403 | # ... | |
404 | } | |
3189d65a | 405 | |
da279afe | 406 | C<blessed> returns the name of the package the argument has been |
407 | blessed into, or C<undef>. | |
3189d65a | 408 | |
71be2cbc | 409 | =item can(METHOD) |
d74e8afc | 410 | X<can> |
a2bdc9a5 | 411 | |
412 | C<can> checks to see if its object has a method called C<METHOD>. | |
413 | If it does, then a reference to the sub is returned; if it does not, then | |
414 | I<undef> is returned. | |
415 | ||
da279afe | 416 | C<UNIVERSAL::can> can also be called as a subroutine with two arguments. It'll |
417 | always return I<undef> if its first argument isn't an object or a class name. | |
418 | The same caveats for calling C<UNIVERSAL::isa> directly apply here, too. | |
b32b0a5d | 419 | |
71be2cbc | 420 | =item VERSION( [NEED] ) |
d74e8afc | 421 | X<VERSION> |
760ac839 | 422 | |
71be2cbc | 423 | C<VERSION> returns the version number of the class (package). If the |
424 | NEED argument is given then it will check that the current version (as | |
425 | defined by the $VERSION variable in the given package) is not less than | |
426 | NEED; it will die if this is not the case. This method is normally | |
427 | called as a class method. This method is called automatically by the | |
428 | C<VERSION> form of C<use>. | |
a2bdc9a5 | 429 | |
a2bdc9a5 | 430 | use A 1.2 qw(some imported subs); |
71be2cbc | 431 | # implies: |
432 | A->VERSION(1.2); | |
a2bdc9a5 | 433 | |
a2bdc9a5 | 434 | =back |
435 | ||
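The three methods listed above can be exercised together in a short sketch. The packages, the speak() method, and the version number are hypothetical; only C<isa>, C<can>, C<VERSION>, and C<Scalar::Util::blessed> are real.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package Critter;
our $VERSION = '1.20';
sub new   { my $class = shift; return bless {}, $class }
sub speak { return "squeak" }

package MyCritter;
our @ISA = ('Critter');

package main;

my $pet = MyCritter->new;

# isa: true for the class itself and for anything it inherits from.
my $is_critter = blessed($pet) && $pet->isa('Critter') ? 1 : 0;

# can: returns a code reference that may be called like a method.
my $code   = $pet->can('speak');
my $sound  = $code ? $pet->$code() : undef;
my $cannot = $pet->can('fly');   # undef, no such method

# VERSION: returns $Critter::VERSION, or dies if NEED is too high.
my $version = Critter->VERSION;
my $ok      = eval { Critter->VERSION(2.0); 1 } ? 1 : 0;

print "isa=$is_critter sound=$sound version_ok=$ok\n";
```

Wrapping the C<VERSION(NEED)> call in C<eval> is only one way to turn the exception into a true/false result; a plain C<use Critter 2.0> would simply die at compile time.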
436 | B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and | |
437 | C<isa> uses a very similar method and cache-ing strategy. This may cause | |
438 | strange effects if the Perl code dynamically changes @ISA in any package. | |
439 | ||
440 | You may add other methods to the UNIVERSAL class via Perl or XS code. | |
14218588 | 441 | You do not need to C<use UNIVERSAL> to make these methods |
38242c00 | 442 | available to your program (and you should not do so). |
a2bdc9a5 | 443 | |
54310121 | 444 | =head2 Destructors |
d74e8afc | 445 | X<destructor> X<DESTROY> |
a0d0e21e LW |
446 | |
447 | When the last reference to an object goes away, the object is | |
448 | automatically destroyed. (This may even be after you exit, if you've | |
449 | stored references in global variables.) If you want to capture control | |
450 | just before the object is freed, you may define a DESTROY method in | |
451 | your class. It will automatically be called at the appropriate moment, | |
4e8e7886 GS |
452 | and you can do any extra cleanup you need to do. Perl passes a reference |
453 | to the object under destruction as the first (and only) argument. Beware | |
454 | that the reference is a read-only value, and cannot be modified by | |
455 | manipulating C<$_[0]> within the destructor. The object itself (i.e. | |
456 | the thingy the reference points to, namely C<${$_[0]}>, C<@{$_[0]}>, | |
457 | C<%{$_[0]}> etc.) is not similarly constrained. | |
458 | ||
f4551fcd MG |
459 | Since DESTROY methods can be called at unpredictable times, it is |
460 | important that you localise any global variables that the method may | |
461 | update. In particular, localise C<$@> if you use C<eval {}> and | |
462 | localise C<$?> if you use C<system> or backticks. | |
463 | ||
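A destructor that localises the globals it may disturb might look like the following sketch. The C<TempResource> class, its C<events> helper, and the deliberately failing cleanup step are all hypothetical and exist only to make the effect observable.

```perl
use strict;
use warnings;

package TempResource;

my @log;   # hypothetical record of cleanup events, for illustration only
sub events { return @log }

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my $self = shift;
    # Keep the eval below (and any child process a real destructor
    # might run) from clobbering the caller's view of $@ and $?.
    local ($@, $?);
    eval { die "cleanup hiccup\n" };   # stand-in for fallible cleanup
    push @log, "destroyed $self->{name}";
}

package main;

eval { die "outer error\n" };
{
    my $res = TempResource->new("scratch");
}   # last reference gone here, so DESTROY runs

my $err = $@;   # still the outer error, not the destructor's
print $err;
print join(", ", TempResource->events), "\n";
```

Without the C<local>, the C<eval> inside DESTROY would overwrite C<$@> while the caller might still be about to inspect it.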
4e8e7886 GS |
464 | If you arrange to re-bless the reference before the destructor returns, |
465 | perl will again call the DESTROY method for the re-blessed object after | |
466 | the current one returns. This can be used for clean delegation of | |
467 | object destruction, or for ensuring that destructors in the base classes | |
468 | of your choosing get called. Explicitly calling DESTROY is also possible, | |
469 | but is rarely needed. | |
470 | ||
14218588 | 471 | Do not confuse the previous discussion with how objects I<CONTAINED> in the current |
4e8e7886 GS |
472 | one are destroyed. Such objects will be freed and destroyed automatically |
473 | when the current object is freed, provided no other references to them exist | |
474 | elsewhere. | |
a0d0e21e LW |
475 | |
476 | =head2 Summary | |
477 | ||
5f05dabc | 478 | That's about all there is to it. Now you need just to go off and buy a |
a0d0e21e LW |
479 | book about object-oriented design methodology, and bang your forehead |
480 | with it for the next six months or so. | |
481 | ||
cb1a09d0 | 482 | =head2 Two-Phased Garbage Collection |
d74e8afc ITB |
483 | X<garbage collection> X<GC> X<circular reference> |
484 | X<reference, circular> X<DESTROY> X<destructor> | |
cb1a09d0 | 485 | |
14218588 GS |
486 | For most purposes, Perl uses a fast and simple, reference-based |
487 | garbage collection system. That means there's an extra | |
cb1a09d0 AD |
488 | dereference going on at some level, so if you haven't built |
489 | your Perl executable using your C compiler's C<-O> flag, performance | |
490 | will suffer. If you I<have> built Perl with C<cc -O>, then this | |
491 | probably won't matter. | |
492 | ||
493 | A more serious concern is that unreachable memory with a non-zero | |
494 | reference count will not normally get freed. Therefore, this is a bad | |
54310121 | 495 | idea: |
cb1a09d0 AD |
496 | |
497 | { | |
498 | my $a; | |
499 | $a = \$a; | |
54310121 | 500 | } |
cb1a09d0 AD |
501 | |
502 | Even though $a I<should> go away, it can't. When building recursive data | |
503 | structures, you'll have to break the self-reference yourself explicitly | |
504 | if you don't care to leak. For example, here's a self-referential | |
505 | node such as one might use in a sophisticated tree structure: | |
506 | ||
507 | sub new_node { | |
eac7fe86 CP |
508 | my $class = shift; |
509 | my $node = {}; | |
cb1a09d0 AD |
510 | $node->{LEFT} = $node->{RIGHT} = $node; |
511 | $node->{DATA} = [ @_ ]; | |
512 | return bless $node => $class; | |
54310121 | 513 | } |
cb1a09d0 AD |
514 | |
515 | If you create nodes like that, they (currently) won't go away unless you | |
516 | break their self reference yourself. (In other words, this is not to be | |
517 | construed as a feature, and you shouldn't depend on it.) | |
518 | ||
519 | Almost. | |
520 | ||
521 | When an interpreter thread finally shuts down (usually when your program | |
522 | exits), then a rather costly but complete mark-and-sweep style of garbage | |
523 | collection is performed, and everything allocated by that thread gets | |
524 | destroyed. This is essential to support Perl as an embedded or a | |
54310121 | 525 | multithreadable language. For example, this program demonstrates Perl's |
cb1a09d0 AD |
526 | two-phased garbage collection: |
527 | ||
54310121 | 528 | #!/usr/bin/perl |
cb1a09d0 AD |
529 | package Subtle; |
530 | ||
531 | sub new { | |
532 | my $test; | |
533 | $test = \$test; | |
534 | warn "CREATING " . \$test; | |
535 | return bless \$test; | |
54310121 | 536 | } |
cb1a09d0 AD |
537 | |
538 | sub DESTROY { | |
539 | my $self = shift; | |
540 | warn "DESTROYING $self"; | |
54310121 | 541 | } |
cb1a09d0 AD |
542 | |
543 | package main; | |
544 | ||
545 | warn "starting program"; | |
546 | { | |
547 | my $a = Subtle->new; | |
548 | my $b = Subtle->new; | |
549 | $$a = 0; # break selfref | |
550 | warn "leaving block"; | |
54310121 | 551 | } |
cb1a09d0 AD |
552 | |
553 | warn "just exited block"; | |
554 | warn "time to die..."; | |
555 | exit; | |
556 | ||
2359510d SD |
557 | When run as F</foo/test>, the following output is produced: |
558 | ||
559 | starting program at /foo/test line 18. | |
560 | CREATING SCALAR(0x8e5b8) at /foo/test line 7. | |
561 | CREATING SCALAR(0x8e57c) at /foo/test line 7. | |
562 | leaving block at /foo/test line 23. | |
563 | DESTROYING Subtle=SCALAR(0x8e5b8) at /foo/test line 13. | |
564 | just exited block at /foo/test line 26. | |
565 | time to die... at /foo/test line 27. | |
cb1a09d0 AD |
566 | DESTROYING Subtle=SCALAR(0x8e57c) during global destruction. |
567 | ||
568 | Notice that "global destruction" bit there? That's the thread | |
54310121 | 569 | garbage collector reaching the unreachable. |
cb1a09d0 | 570 | |
14218588 GS |
571 | Objects are always destructed, even when regular refs aren't. Objects |
572 | are destructed in a separate pass before ordinary refs just to | |
cb1a09d0 | 573 | prevent object destructors from using refs that have been themselves |
5f05dabc | 574 | destructed. Plain refs are only garbage-collected if the destruct level |
cb1a09d0 AD |
575 | is greater than 0. You can test the higher levels of global destruction |
576 | by setting the PERL_DESTRUCT_LEVEL environment variable, presuming | |
577 | C<-DDEBUGGING> was enabled during perl build time. | |
64cea5fd | 578 | See L<perlhack/PERL_DESTRUCT_LEVEL> for more information. |
cb1a09d0 AD |
579 | |
580 | A more complete garbage collection strategy will be implemented | |
581 | at a future date. | |
582 | ||
5a964f20 TC |
583 | In the meantime, the best solution is to create a non-recursive container |
584 | class that holds a pointer to the self-referential data structure. | |
585 | Define a DESTROY method for the containing object's class that manually | |
586 | breaks the circularities in the self-referential structure. | |
587 | ||
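The container approach can be sketched as follows, assuming a hypothetical C<Tree> class that owns a single self-referential C<Node>. The destruction counter is there only to make the result visible; a real structure with many nodes would need to walk them all in the container's DESTROY.

```perl
use strict;
use warnings;

package Node;
my $destroyed = 0;   # counter used only to observe cleanup
sub destroyed_count { return $destroyed }

sub new {
    my ($class, @data) = @_;
    my $node = bless { DATA => [@data] }, $class;
    $node->{LEFT} = $node->{RIGHT} = $node;   # deliberate self-reference
    return $node;
}
sub DESTROY { $destroyed++ }

package Tree;   # non-recursive container that owns the cyclic structure

sub new {
    my ($class, @data) = @_;
    return bless { ROOT => Node->new(@data) }, $class;
}

sub DESTROY {
    my $self = shift;
    my $root = $self->{ROOT} or return;
    # Break the circularities so the node's reference count can reach zero.
    delete @{$root}{qw(LEFT RIGHT)};
}

package main;

{
    my $tree = Tree->new(1, 2, 3);
}   # Tree::DESTROY breaks the cycle, then the Node is freed

print "nodes destroyed: ", Node->destroyed_count, "\n";   # nodes destroyed: 1
```

Callers only ever hold the C<Tree> object, which contains no cycle itself, so ordinary reference counting is enough to trigger the cleanup.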
a0d0e21e LW |
588 | =head1 SEE ALSO |
589 | ||
8257a158 | 590 | A kinder, gentler tutorial on object-oriented programming in Perl can |
890a53b9 | 591 | be found in L<perltoot>, L<perlboot> and L<perltooc>. You should |
8257a158 MS |
592 | also check out L<perlbot> for other object tricks, traps, and tips, as |
593 | well as L<perlmodlib> for some style guides on constructing both | |
594 | modules and classes. |