This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Remove all mention of checkpods
[perl5.git] / lib / Memoize / TODO
CommitLineData
3d4a255c 1# Version 0.05 alpha $Revision: 1.6 $ $Date: 2001/06/24 17:11:26 $
a0cb3900
JH
2
3=head1 TO DO
4
5=over 4
6
7=item *
8
9LIST_CACHE doesn't work with ties to most DBM implementations, because
10Memouze tries to save a listref, and DB_File etc. can only store
11strings. This should at least be documented. Maybe Memoize could
12detect the problem at TIE time and throw a fatal error.
13
899dc88a
JH
1420010623 This was added sometime prior to 20001025.
15
a0cb3900
JH
16Try out MLDBM here and document it if it works.
17
18=item *
19
20We should extend the benchmarking module to allow
21
22 timethis(main, { MEMOIZED => [ suba, subb ] })
23
24What would this do? It would time C<main> three times, once with
25C<suba> and C<subb> unmemoized, twice with them memoized.
26
27Why would you want to do this? By the third set of runs, the memo
28tables would be fully populated, so all calls by C<main> to C<suba>
29and C<subb> would return immediately. You would be able to see how
30much of C<main>'s running time was due to time spent computing in
31C<suba> and C<subb>. If that was just a little time, you would know
32that optimizing or improving C<suba> and C<subb> would not have a
33large effect on the performance of C<main>. But if there was a big
34difference, you would know that C<suba> or C<subb> was a good
35candidate for optimization if you needed to make C<main> go faster.
36
37Done.
38
39=item *
40
41Perhaps C<memoize> should return a reference to the original function
42as well as one to the memoized version? But the programmer could
43always construct such a reference themselves, so perhaps it's not
44necessary. We save such a reference anyway, so a new package method
45could return it on demand even if it wasn't provided by C<memoize>.
46We could even bless the new function reference so that it could have
47accessor methods for getting to the original function, the options,
48the memo table, etc.
49
50Naah.
51
52=item *
53
54The TODISK feature is not ready yet. It will have to be rather
55complicated, providing options for which disk method to use (GDBM?
56DB_File? Flat file? Storable? User-supplied?) and which stringizing
57method to use (FreezeThaw? Marshal? User-supplied?)
58
59Done!
60
61=item *
62
63Maybe an option for automatic expiration of cache values? (`After one
64day,' `After five uses,' etc.) Also possibly an option to limit the
65number of active entries with automatic LRU expiration.
66
67You have a long note to Mike Cariaso that outlines a good approach
68that you sent on 9 April 1999.
69
70What's the timeout stuff going to look like?
71
72 EXPIRE_TIME => time_in_sec
73 EXPIRE_USES => num_uses
74 MAXENTRIES => n
75
76perhaps? Is EXPIRE_USES actually useful?
77
7819990916: Memoize::Expire does EXPIRE_TIME and EXPIRE_USES.
79MAXENTRIES can come later as a separate module.
80
81=item *
82
83Put in a better example than C<fibo>. Show an example of a
84nonrecursive function that simply takes a long time to run.
85C<getpwuid> for example? But this exposes the bug that you can't say
86C<memoize('getpwuid')>, so perhaps it's not a very good example.
87
88Well, I did add the ColorToRGB example, but it's still not so good.
89These examples need a lot of work. C<factorial> might be a better
90example than C<fibo>.
91
92=item *
93
94Add more regression tests for normalizers.
95
96=item *
97
98Maybe resolve normalizer function to code-ref at memoize time instead
99of at function call time for efficiency? I think there was some
100reason not to do this, but I can't remember what it was.
101
102=item *
103
104Add more array value tests to the test suite.
105
106Does it need more now?
107
108=item *
109
110Fix that `Subroutine u redefined ... line 484' message.
111
112Fixed, I think.
113
114=item *
115
116Get rid of any remaining *{$ref}{CODE} or similar magic hashes.
117
118=item *
119
120There should be an option to dump out the memoized values or to
121otherwise traverse them.
122
123What for?
124
125Maybe the tied hash interface taskes care of this anyway?
126
127=item *
128
129Include an example that caches DNS lookups.
130
131=item *
132
133Make tie for Storable (Memoize::Storable)
134
135A prototype of Memoize::Storable is finished. Test it and add to the
136test suite.
137
138Done.
139
140=item *
141
142Make tie for DBI (Memoize::DBI)
143
144=item *
145
146I think there's a bug. See `###BUG'.
147
148=item *
149
150Storable probably can't be done, because it doesn't allow updating.
151Maybe a different interface that supports readonly caches fronted by a
152writable in-memory cache? A generic tied hash maybe?
153
154 FETCH {
155 if (it's in the memory hash) {
156 return it
157 } elsif (it's in the readonly disk hash) {
158 return it
159 } else {
160 not-there
161 }
162 }
163
164 STORE {
165 put it into the in-memory hash
166 }
167
168Maybe `save' and `restore' methods?
169
170It isn't working right because the destructor doesn't get called at
171the right time.
172
173This is fixed. `use strict vars' would have caught it immediately. Duh.
174
175=item *
176
177Don't forget about generic interface to Storable-like packages
178
899dc88a 17920010627 It would appear that you put this into 0.51.
a0cb3900 180
899dc88a 181=item *
a0cb3900
JH
182
183Maybe add in TODISK after all, with TODISK => 'filename' equivalent to
184
185 SCALAR_CACHE => [TIE, Memoize::SDBM_File, $filename, O_RDWR|O_CREAT, 0666],
186 LIST_CACHE => MERGE
187
188=item *
189
190Maybe the default for LIST_CACHE should be MERGE anyway.
191
192=item *
193
194There's some terrible bug probably related to use under threaded perl,
195possibly connected with line 56:
196
197 my $wrapper = eval "sub { unshift \@_, qq{$cref}; goto &_memoizer; }";
198
199I think becayse C<@_> is lexically scoped in threadperl, the effect of
200C<unshift> never makes it into C<_memoizer>. That's probably a bug in
201Perl, but maybe I should work around it. Can anyone provide more
202information here, or lend me a machine with threaded Perl where I can
203test this theory? Line 59, currently commented out, may fix the
204problem.
205
899dc88a
JH
20620010623 Working around this in 0.65, but it still blows.
207
a0cb3900
JH
208=item *
209
210Maybe if the original function has a prototype, the module can use
211that to select the most appropriate default normalizer. For example,
212if the prototype was C<($)>, there's no reason to use `join'. If it's
213C<(\@)> then it can use C<join $;,@$_[0];> instead of C<join $;,@_;>.
214
215=item *
216
217Ariel Scolnikov suggests using the change counting problem as an
218example. (How many ways to make change of a dollar?)
219
220=item *
221
899dc88a
JH
222Jonathan Roy found a use for `unmemoize'. If you're using the
223Storable glue, and your program gets SIGINT, you find that the cache
224data is not in the cache, because Perl normally writes it all out at
225once from a DESTROY method, and signals skip DESTROY processing. So
226you could add
a0cb3900
JH
227
228 $sig{INT} = sub { unmemoize ... };
229
a0cb3900
JH
230
231=item *
232
233This means it would be useful to have a method to return references to
234all the currently-memoized functions so that you could say
235
236 $sig{INT} = sub { for $f (Memoize->all_memoized) {
237 unmemoize $f;
238 }
239 }
240
241
242=item *
243
24419990917 There should be a call you can make to get back the cache
245itself. If there were, then you could delete stuff from it to
246manually expire data items.
247
248=item *
249
25019990925 Randal says that the docs for Memoize;:Expire should make it
251clear that the expired entries are never flushed all at once. He
252asked if you would need to do that manually. I said:
253
254 Right, if that's what you want. If you have EXISTS return false,
255 it'll throw away the old cached item and replace it in the cache
256 with a new item. But if you want the cache to actually get smaller,
257 you have to do that yourself.
258
259 I was planning to build an Expire module that implemented an LRU
260 queue and kept the cache at a constant fixed size, but I didn't get
261 to it yet. It's not clear to me that the automatic exptynig-out
262 behavior is very useful anyway. The whole point of a cache is to
263 trade space for time, so why bother going through the cache to throw
264 away old items before you need to?
265
266Randal then pointed out that it could discard expired items at DESTRoY
267or TIEHASH time, which seemed like a good idea, because if the cache
268is on disk you might like to keep it as small as possible.
269
270=item *
271
27219991219 Philip Gwyn suggests this technique: You have a load_file
273function that memoizes the file contexts. But then if the file
274changes you get the old contents. So add a normalizer that does
275
276 return join $;, (stat($_[0])[9]), $_[0];
277
278Now when the modification date changes, the true key returned by the
279normalizer is different, so you get a cache miss and it loads the new
280contents. Disadvantage: The old contents are still in the cache. I
281think it makes more sense to have a special expiration manager for
282this. Make one up and bundle it.
283
28419991220 I have one written: Memoize::ExpireFile. But how can you
285make this work when the function might have several arguments, of
286which some are filenames and some aren't?
287
288=item *
289
29019991219 There should be an inheritable TIEHASH method that does the
291argument processing properly.
292
29319991220 Philip Gwyn contributed a patch for this.
294
29520001231 You should really put this in. Jonathan Roy uncovered a
296problem that it will be needed to solve. Here's the problem: He has:
297
298 memoize "get_items",
299 LIST_CACHE => ["TIE", "Memoize::Expire",
300 LIFETIME => 86400,
301 TIE => ["DB_File", "debug.db", O_CREAT|O_RDWR, 0666]
302 ];
303
304This won't work, because memoize is trying to store listrefs in a
305DB_File. He owuld have gotten a fatal error if he had done this:
306
307 memoize "get_items",
308 LIST_CACHE => ["TIE", "DB_File", "debug.db", O_CREAT|O_RDWR, 0666]'
309
310
311But in this case, he tied the cache to Memoize::Expire, which is *not*
312scalar-only, and the check for scalar-only ties is missing from
313Memoize::Expire. The inheritable method can take care of this.
314
899dc88a
JH
31520010623 I decided not to put it in. Instead, we avoid the problem by
316getting rid of TIE. The HASH option does the same thing, and HASH is
317so simple to support that a module is superfluous.
318
a0cb3900
JH
319=item *
320
32120001130 Custom cache manager that checks to make sure the function
322return values actually match the memoized values.
323
324=item *
325
32620001231 Expiration manager that watches cache performance and
327accumulates statistics. Variation: Have it automatically unmemoize
328the function if performance is bad.
329
330=item *
331
33220010517 Option to have normalizer *modify* @_ for use by memoized
333function. This would save code and time in cases like the one in the
334manual under 'NORMALIZER', where both f() and normalize_f() do the
335same analysis and make the same adjustments to the hash. If the
336normalizer could make the adjustments and save the changes in @_, you
337wouldn't have to do it twice.
338
3d4a255c 339=item *
899dc88a
JH
34020010623 Add CLEAR methods to tied hash modules.
341
3d4a255c 342=item *
899dc88a
JH
34320010623 You get a warning if you try to use DB_File as LIST_CACHE,
344because it won't store lists. But if you use it as the underlying
345cache with an expiration manager in the middle, no warning---the
346expiration manager doesn't know it's managing a list cache, and
347memoize doesn't know that DB_File is underlying. Is this fixable?
348Probably not, but think about it.
349
a0cb3900
JH
350=item *
351There was probably some other stuff that I forgot.
352
353
354
355=back