This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
[perl5.git] / lib / Memoize / TODO
# Version 0.05 alpha $Revision: 1.5 $ $Date: 1999/09/17 14:57:55 $

=head1 TO DO

=over 4

=item *

LIST_CACHE doesn't work with ties to most DBM implementations, because
Memoize tries to save a listref, and DB_File etc. can only store
strings. This should at least be documented. Maybe Memoize could
detect the problem at TIE time and throw a fatal error.

Try out MLDBM here and document it if it works.

=item *

We should extend the benchmarking module to allow

        timethis(main, { MEMOIZED => [ suba, subb ] })

What would this do? It would time C<main> three times, once with
C<suba> and C<subb> unmemoized, twice with them memoized.

Why would you want to do this? By the third set of runs, the memo
tables would be fully populated, so all calls by C<main> to C<suba>
and C<subb> would return immediately. You would be able to see how
much of C<main>'s running time was due to time spent computing in
C<suba> and C<subb>. If that was just a little time, you would know
that optimizing or improving C<suba> and C<subb> would not have a
large effect on the performance of C<main>. But if there was a big
difference, you would know that C<suba> or C<subb> was a good
candidate for optimization if you needed to make C<main> go faster.
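
A rough sketch of those three timed runs, using the core Benchmark and
Memoize modules rather than any real C<timethis> extension. The helper
C<time_memoized>, the workloads in C<suba> and C<subb>, and the name
C<main_work> (standing in for C<main>) are all made up for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);
use Memoize qw(memoize unmemoize);

# Hypothetical stand-ins for the functions discussed above.
sub suba { my $n = shift; my $s = 0; $s += $_ for 1 .. $n; return $s }
sub subb { my $n = shift; return $n % 7 }
sub main_work {
    my $t = 0;
    $t += suba(2000) + subb($_) for 1 .. 200;
    return $t;
}

# Time $code three times: once plain, then twice with @names memoized.
sub time_memoized {
    my ($code, @names) = @_;
    my @times = (timeit(1, $code));    # run 1: unmemoized
    memoize($_) for @names;
    push @times, timeit(1, $code);     # run 2: fills the memo tables
    push @times, timeit(1, $code);     # run 3: tables already populated
    unmemoize($_) for @names;          # put the original functions back
    return @times;
}

my @t = time_memoized(\&main_work, 'suba', 'subb');
print "run $_: ", timestr($t[$_ - 1]), "\n" for 1 .. 3;
```

Comparing run 1 with run 3 gives an idea of how much of the total time
is spent inside the memoized functions.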

Done.

=item *

Perhaps C<memoize> should return a reference to the original function
as well as one to the memoized version? But the programmer could
always construct such a reference themselves, so perhaps it's not
necessary. We save such a reference anyway, so a new package method
could return it on demand even if it wasn't provided by C<memoize>.
We could even bless the new function reference so that it could have
accessor methods for getting to the original function, the options,
the memo table, etc.

Naah.

=item *

The TODISK feature is not ready yet. It will have to be rather
complicated, providing options for which disk method to use (GDBM?
DB_File? Flat file? Storable? User-supplied?) and which stringizing
method to use (FreezeThaw? Marshal? User-supplied?)

Done!

=item *

Maybe an option for automatic expiration of cache values? (`After one
day,' `After five uses,' etc.) Also possibly an option to limit the
number of active entries with automatic LRU expiration.

You have a long note to Mike Cariaso that outlines a good approach
that you sent on 9 April 1999.

What's the timeout stuff going to look like?

        EXPIRE_TIME => time_in_sec
        EXPIRE_USES => num_uses
        MAXENTRIES => n

perhaps? Is EXPIRE_USES actually useful?

19990916: Memoize::Expire does EXPIRE_TIME and EXPIRE_USES.
MAXENTRIES can come later as a separate module.
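
One way a separate MAXENTRIES module might look, as a minimal sketch
only: a tied hash that evicts the least recently used key once it
holds more than a fixed number of entries. The package name
C<My::LRUCache> and the helper C<slow_square> are invented here, and
the sketch assumes a Memoize version that accepts
C<< SCALAR_CACHE => [HASH => \%cache] >>.

```perl
use strict;
use warnings;
use Memoize qw(memoize);

# Hypothetical package name; not part of the Memoize distribution.
package My::LRUCache;

sub TIEHASH {
    my ($class, %opt) = @_;
    my $self = { max => $opt{MAXENTRIES} || 100, data => {}, used => {}, tick => 0 };
    return bless $self, $class;
}

sub EXISTS { my ($self, $key) = @_; return exists $self->{data}{$key} }

sub FETCH {
    my ($self, $key) = @_;
    # Reading an entry makes it the most recently used one.
    $self->{used}{$key} = ++$self->{tick} if exists $self->{data}{$key};
    return $self->{data}{$key};
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->{data}{$key} = $value;
    $self->{used}{$key} = ++$self->{tick};
    # Evict least recently used entries until we are back under the limit.
    while (keys %{ $self->{data} } > $self->{max}) {
        my ($oldest) = sort { $self->{used}{$a} <=> $self->{used}{$b} }
                       keys %{ $self->{used} };
        delete $self->{data}{$oldest};
        delete $self->{used}{$oldest};
    }
    return $value;
}

package main;

my $calls = 0;
sub slow_square { my $n = shift; $calls++; return $n * $n }

tie my %cache, 'My::LRUCache', MAXENTRIES => 2;
memoize('slow_square', SCALAR_CACHE => [ HASH => \%cache ]);

my @results = map { scalar slow_square($_) } 1, 2, 1, 3, 2;
print "results: @results, calls: $calls\n";
```

Sorting the keys on every eviction is linear in the cache size; a real
module would probably keep a proper queue instead.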

=item *

Put in a better example than C<fibo>. Show an example of a
nonrecursive function that simply takes a long time to run.
C<getpwuid> for example? But this exposes the bug that you can't say
C<memoize('getpwuid')>, so perhaps it's not a very good example.

Well, I did add the ColorToRGB example, but it's still not so good.
These examples need a lot of work. C<factorial> might be a better
example than C<fibo>.

=item *

Add more regression tests for normalizers.

=item *

Maybe resolve normalizer function to code-ref at memoize time instead
of at function call time for efficiency? I think there was some
reason not to do this, but I can't remember what it was.

=item *

Add more array value tests to the test suite.

Does it need more now?

=item *

Fix that `Subroutine u redefined ... line 484' message.

Fixed, I think.

=item *

Get rid of any remaining *{$ref}{CODE} or similar magic hashes.

=item *

There should be an option to dump out the memoized values or to
otherwise traverse them.

What for?

Maybe the tied hash interface takes care of this anyway?

=item *

Include an example that caches DNS lookups.

=item *

Make tie for Storable (Memoize::Storable)

A prototype of Memoize::Storable is finished. Test it and add to the
test suite.

Done.

=item *

Make tie for DBI (Memoize::DBI)

=item *

I think there's a bug. See `###BUG'.

=item *

Storable probably can't be done, because it doesn't allow updating.
Maybe a different interface that supports readonly caches fronted by a
writable in-memory cache? A generic tied hash maybe?

        FETCH {
          if (it's in the memory hash) {
            return it
          } elsif (it's in the readonly disk hash) {
            return it
          } else {
            not-there
          }
        }

        STORE {
          put it into the in-memory hash
        }

Maybe `save' and `restore' methods?

It isn't working right because the destructor doesn't get called at
the right time.

This is fixed. `use strict vars' would have caught it immediately. Duh.
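
The FETCH/STORE outline above could be turned into a generic tied hash
roughly as follows. This is only a sketch: the package name
C<My::LayeredCache> is invented, and an ordinary hash stands in for
the readonly disk hash.

```perl
use strict;
use warnings;

# Hypothetical package name, used only for this illustration.
package My::LayeredCache;

# TIEHASH takes a reference to the read-only backing hash,
# which may itself be tied to a file.
sub TIEHASH {
    my ($class, $readonly) = @_;
    return bless { memory => {}, disk => $readonly || {} }, $class;
}

sub EXISTS {
    my ($self, $key) = @_;
    return 1 if exists $self->{memory}{$key};
    return exists $self->{disk}{$key};
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->{memory}{$key} if exists $self->{memory}{$key};
    return $self->{disk}{$key}   if exists $self->{disk}{$key};
    return;    # not there
}

# New values only ever go into the in-memory layer.
sub STORE {
    my ($self, $key, $value) = @_;
    return $self->{memory}{$key} = $value;
}

package main;

my %on_disk = (alpha => 1, beta => 2);    # stands in for the disk hash
tie my %cache, 'My::LayeredCache', \%on_disk;

$cache{gamma} = 3;     # goes to memory only
$cache{alpha} = 10;    # shadows the disk value without changing it

print "alpha=$cache{alpha} beta=$cache{beta} gamma=$cache{gamma}\n";
print "disk alpha is still $on_disk{alpha}\n";
```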

=item *

Don't forget about generic interface to Storable-like packages

=item *

Maybe add in TODISK after all, with TODISK => 'filename' equivalent to

        SCALAR_CACHE => [TIE, Memoize::SDBM_File, $filename, O_RDWR|O_CREAT, 0666],
        LIST_CACHE => MERGE

=item *

Maybe the default for LIST_CACHE should be MERGE anyway.

=item *

There's some terrible bug probably related to use under threaded perl,
possibly connected with line 56:

        my $wrapper = eval "sub { unshift \@_, qq{$cref}; goto &_memoizer; }";

I think because C<@_> is lexically scoped in threadperl, the effect of
C<unshift> never makes it into C<_memoizer>. That's probably a bug in
Perl, but maybe I should work around it. Can anyone provide more
information here, or lend me a machine with threaded Perl where I can
test this theory? Line 59, currently commented out, may fix the
problem.

=item *

Maybe if the original function has a prototype, the module can use
that to select the most appropriate default normalizer. For example,
if the prototype was C<($)>, there's no reason to use `join'. If it's
C<(\@)> then it can use C<join $;,@{$_[0]};> instead of C<join $;,@_;>.
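
A sketch of how a caller could do this by hand today, assuming the
NORMALIZER option accepts a code reference. The helper
C<normalizer_for> and the example function C<total> are invented for
illustration; nothing like this is built into Memoize.

```perl
use strict;
use warnings;
use Memoize qw(memoize);

# Pick a normalizer from the function's prototype (hypothetical helper).
sub normalizer_for {
    my ($code) = @_;
    my $proto = prototype($code);
    return sub { "$_[0]" }             if defined $proto && $proto eq '$';
    return sub { join $;, @{ $_[0] } } if defined $proto && $proto eq '\@';
    return sub { join $;, @_ };        # general default
}

my $calls = 0;
sub total(\@) {
    my ($aref) = @_;
    $calls++;
    my $sum = 0;
    $sum += $_ for @$aref;
    return $sum;
}

memoize('total', NORMALIZER => normalizer_for(\&total));

my @a = (1, 2, 3);
my @b = (1, 2, 3);
my $x = total(@a);
my $y = total(@b);    # different array, same contents: same cache key
print "sums: $x $y, calls: $calls\n";
```

With the ordinary default normalizer the key would be the stringified
array reference, so the second call would not find the first call's
entry.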

=item *

Ariel Scolnikov suggests using the change counting problem as an
example. (How many ways to make change of a dollar?)

=item *

I found a use for `unmemoize'. If you're using the Storable glue, and
your program gets SIGINT, you find that the cache data is not in the
cache, because Perl normally writes it all out at once from a
DESTROY method, and signals skip DESTROY processing. So you could add

        $SIG{INT} = sub { unmemoize ... };

(Jonathan Roy pointed this out)

=item *

This means it would be useful to have a method to return references to
all the currently-memoized functions so that you could say

        $SIG{INT} = sub { for $f (Memoize->all_memoized) {
                            unmemoize $f;
                          }
                        }
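
Until such a method exists, a caller can keep the list itself. The
following is a sketch under that assumption; C<memoize_tracked> and
C<unmemoize_all> are hypothetical wrappers, not part of Memoize, and
they expect function names that live in the calling package.

```perl
use strict;
use warnings;
use Memoize qw(memoize unmemoize);

my @memoized;    # names of functions memoized through the wrapper below

# Hypothetical wrapper: remember the name, then memoize as usual.
sub memoize_tracked {
    my ($name, @options) = @_;
    push @memoized, $name;
    return memoize($name, @options);
}

# Unmemoize everything that was registered, emptying the list as we go.
sub unmemoize_all {
    unmemoize($_) for splice @memoized;
    return;
}

my $calls = 0;
sub lookup { my $key = shift; $calls++; return uc $key }

memoize_tracked('lookup');

# Drop the caches on SIGINT, then exit normally.
$SIG{INT} = sub { unmemoize_all(); exit 1 };

my $first  = lookup('a');
my $second = lookup('a');    # served from the cache
unmemoize_all();
my $third  = lookup('a');    # computed again, no longer memoized
print "calls: $calls\n";
```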

=item *

19990917 There should be a call you can make to get back the cache
itself. If there were, then you could delete stuff from it to
manually expire data items.
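
Where the installed Memoize supports the C<HASH> cache option, the
caller can supply the cache hash and so already holds it. A small
sketch, with C<price> as a made-up function; it relies on the default
normalizer, under which a single argument is used as the key as-is.

```perl
use strict;
use warnings;
use Memoize qw(memoize);

my $calls = 0;
sub price { my $item = shift; $calls++; return length($item) * 10 }

# Supply our own hash so we keep a handle on the cache.
my %cache;
memoize('price', SCALAR_CACHE => [ HASH => \%cache ]);

my $first  = price('apple');
my $second = price('apple');    # cache hit
delete $cache{apple};           # manually expire one entry
my $third  = price('apple');    # recomputed and stored again
print "calls: $calls, cached keys: @{[ sort keys %cache ]}\n";
```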

=item *

19990925 Randal says that the docs for Memoize::Expire should make it
clear that the expired entries are never flushed all at once. He
asked if you would need to do that manually. I said:

  Right, if that's what you want. If you have EXISTS return false,
  it'll throw away the old cached item and replace it in the cache
  with a new item. But if you want the cache to actually get smaller,
  you have to do that yourself.

  I was planning to build an Expire module that implemented an LRU
  queue and kept the cache at a constant fixed size, but I didn't get
  to it yet. It's not clear to me that the automatic emptying-out
  behavior is very useful anyway. The whole point of a cache is to
  trade space for time, so why bother going through the cache to throw
  away old items before you need to?

Randal then pointed out that it could discard expired items at DESTROY
or TIEHASH time, which seemed like a good idea, because if the cache
is on disk you might like to keep it as small as possible.

=item *

19991219 Philip Gwyn suggests this technique: You have a load_file
function that memoizes the file contents. But then if the file
changes you get the old contents. So add a normalizer that does

        return join $;, (stat($_[0]))[9], $_[0];

Now when the modification date changes, the true key returned by the
normalizer is different, so you get a cache miss and it loads the new
contents. Disadvantage: The old contents are still in the cache. I
think it makes more sense to have a special expiration manager for
this. Make one up and bundle it.

19991220 I have one written: Memoize::ExpireFile. But how can you
make this work when the function might have several arguments, of
which some are filenames and some aren't?
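
The normalizer technique, written out as a small runnable sketch for
the single-filename case. The names C<mtime_key> and the temporary
file are illustrative only, and C<utime> is used just to force a
different modification time without waiting.

```perl
use strict;
use warnings;
use Memoize qw(memoize);
use File::Temp qw(tempfile);

my $loads = 0;
sub load_file {
    my ($path) = @_;
    $loads++;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                        # slurp mode
    return scalar <$fh>;
}

# Cache key = modification time plus filename, as in the note above.
sub mtime_key { return join $;, (stat($_[0]))[9], $_[0] }

memoize('load_file', NORMALIZER => \&mtime_key);

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "old contents\n";
close $out;

my $before = load_file($path);
my $again  = load_file($path);       # same mtime: cache hit

open $out, '>', $path or die "Cannot write $path: $!";
print {$out} "new contents\n";
close $out;
my $later = time + 60;
utime $later, $later, $path;         # force a different mtime for the demo

my $after = load_file($path);        # new mtime: cache miss, file reloaded
print "loads: $loads\n$before$after";
```

As the note says, the entry for the old modification time stays in the
cache; nothing here removes it.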

=item *

19991219 There should be an inheritable TIEHASH method that does the
argument processing properly.

19991220 Philip Gwyn contributed a patch for this.

20001231 You should really put this in. Jonathan Roy uncovered a
problem that it will be needed to solve. Here's the problem: He has:

        memoize "get_items",
          LIST_CACHE => ["TIE", "Memoize::Expire",
                         LIFETIME => 86400,
                         TIE => ["DB_File", "debug.db", O_CREAT|O_RDWR, 0666]
                        ];

This won't work, because memoize is trying to store listrefs in a
DB_File. He would have gotten a fatal error if he had done this:

        memoize "get_items",
          LIST_CACHE => ["TIE", "DB_File", "debug.db", O_CREAT|O_RDWR, 0666];

But in this case, he tied the cache to Memoize::Expire, which is *not*
scalar-only, and the check for scalar-only ties is missing from
Memoize::Expire. The inheritable method can take care of this.

=item *

20001130 Custom cache manager that checks to make sure the function
return values actually match the memoized values.

=item *

20001231 Expiration manager that watches cache performance and
accumulates statistics. Variation: Have it automatically unmemoize
the function if performance is bad.
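
A minimal sketch of the statistics half of this idea: a tied hash that
counts hits and misses as Memoize consults it. The package name
C<My::CountingCache>, its C<hit_rate> method, and the C<collatz_steps>
example are all invented here, and the counting relies on Memoize
calling EXISTS once per lookup, which is an assumption about its
internals.

```perl
use strict;
use warnings;
use Memoize qw(memoize);

# Hypothetical package name; an in-memory cache that counts lookups.
package My::CountingCache;

sub TIEHASH {
    my ($class) = @_;
    return bless { data => {}, hits => 0, misses => 0 }, $class;
}

# Count each lookup here, on the assumption that every call to the
# memoized function checks EXISTS exactly once.
sub EXISTS {
    my ($self, $key) = @_;
    my $found = exists $self->{data}{$key};
    $found ? $self->{hits}++ : $self->{misses}++;
    return $found;
}

sub FETCH { my ($self, $key) = @_; return $self->{data}{$key} }
sub STORE { my ($self, $key, $value) = @_; return $self->{data}{$key} = $value }

sub hit_rate {
    my ($self) = @_;
    my $total = $self->{hits} + $self->{misses};
    return $total ? $self->{hits} / $total : 0;
}

package main;

sub collatz_steps {
    my ($n) = @_;
    return 0 if $n == 1;
    return 1 + collatz_steps($n % 2 ? 3 * $n + 1 : $n / 2);
}

my $stats = tie my %cache, 'My::CountingCache';
memoize('collatz_steps', SCALAR_CACHE => [ HASH => \%cache ]);

my @steps = map { scalar collatz_steps($_) } 1 .. 20;
printf "hits=%d misses=%d hit rate=%.2f\n",
    $stats->{hits}, $stats->{misses}, $stats->hit_rate;
```

For the variation, a caller could check C<hit_rate> after some number
of lookups and call C<unmemoize> if it stays low.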

=item *

20010517 Option to have normalizer *modify* @_ for use by memoized
function. This would save code and time in cases like the one in the
manual under 'NORMALIZER', where both f() and normalize_f() do the
same analysis and make the same adjustments to the hash. If the
normalizer could make the adjustments and save the changes in @_, you
wouldn't have to do it twice.

=item *

There was probably some other stuff that I forgot.

=back