Recipe 18.7. Caching Objects with a FIFO Pruning Strategy

Credit: David M. Wilson, Raymond Hettinger

Problem

You need to build a mapping to be used as a cache, holding up to a fixed number of previous entries and automatically discarding older entries.

Solution

A mapping can implement a relatively small number of core methods and rely on UserDict.DictMixin to provide all the other methods that make up the full official mapping interface. Here is a mapping class for FIFO caching, implemented with this "let DictMixin do it" strategy:

import UserDict

class FifoCache(object, UserDict.DictMixin):
    ''' A mapping that remembers the last `num_entries' items that were set '''
    def __init__(self, num_entries, dct=()):
        self.num_entries = num_entries
        self.dct = dict(dct)
        self.lst = []
    def __repr__(self):
        return '%s(%r, %r)' % (
            self.__class__.__name__, self.num_entries, self.dct)
    def copy(self):
        return self.__class__(self.num_entries, self.dct)
    def keys(self):
        return list(self.lst)
    def __getitem__(self, key):
        return self.dct[key]
    def __setitem__(self, key, value):
        dct = self.dct
        lst = self.lst
        if key in dct:
            lst.remove(key)
        dct[key] = value
        lst.append(key)
        if len(lst) > self.num_entries:
            del dct[lst.pop(0)]
    def __delitem__(self, key):
        self.dct.pop(key)
        self.lst.remove(key)
    # a method explicitly defined only as an optimization
    def __contains__(self, item):
        return item in self.dct
    has_key = __contains__

Discussion

Here is a typical example of usage for this FifoCache class:

if __name__ == '__main__':
    f = FifoCache(num_entries=3)
    f["fly"] = "foo"
    f["moo"] = "two"
    f["bar"] = "baz"
    f["dave"] = "wilson"
    f["age"] = 20
    print f.keys()    # emits ['bar', 'dave', 'age']

For any case where you might use a dictionary object to cache expensive lookups, the FifoCache class shown in this recipe might be a safer alternative for use in long-running applications, whose caches might otherwise consume all system memory if left unchecked.

Thanks to UserDict.DictMixin, class FifoCache exhibits a full dictionary (i.e., mapping) interface: you can substitute an instance of FifoCache wherever you're using a dictionary (as long as you do want entries that were set "a long time ago" to drop out automatically to make space for newer ones).
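UserDict.DictMixin exists only in Python 2; in Python 3, the analogous "implement a few core methods, inherit the rest" service is provided by collections.abc.MutableMapping. As a sketch of the same FIFO idea on that base class (the name FifoCache3 and the port itself are this example's assumptions, not part of the original recipe):

```python
from collections.abc import MutableMapping

class FifoCache3(MutableMapping):
    ''' Python 3 sketch of the recipe's FIFO cache, built on
        MutableMapping instead of UserDict.DictMixin '''
    def __init__(self, num_entries, dct=()):
        self.num_entries = num_entries
        self.dct = dict(dct)
        self.lst = list(self.dct)     # insertion order of live keys
    def __getitem__(self, key):
        return self.dct[key]
    def __setitem__(self, key, value):
        if key in self.dct:
            self.lst.remove(key)
        self.dct[key] = value
        self.lst.append(key)
        if len(self.lst) > self.num_entries:
            # evict the oldest entry, exactly as in the recipe
            del self.dct[self.lst.pop(0)]
    def __delitem__(self, key):
        del self.dct[key]
        self.lst.remove(key)
    def __iter__(self):               # MutableMapping derives keys(),
        return iter(self.lst)         # items(), get(), etc. from these
    def __len__(self):
        return len(self.lst)
```

Here, get, items, pop, setdefault, update, and the in operator all come for free from MutableMapping, just as DictMixin supplies them in this recipe.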

In Python 2.4, you can get a faster version of FifoCache by setting self.lst to be an instance of collections.deque rather than a list, and using self.lst.popleft() where this recipe's solution uses self.lst.pop(0). Since the deque type does not have a remove method, you have to implement that with a little auxiliary function:

def remove_from_deque(d, x):
    for i, v in enumerate(d):
        if v == x:
            del d[i]
            return
    raise ValueError, '%r not in %r' % (x, d)

and use remove_from_deque(self.lst, key) where this recipe's Solution uses self.lst.remove(key). While, as always, you should measure how useful this optimization is in the context of your specific application, it's likely to be helpful when num_entries is high, since self.lst.pop(0) on a list self.lst is O(n), while self.lst.popleft() on a deque self.lst is O(1). (remove_from_deque, like list.remove, is unfortunately and unavoidably O(n).)
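To see the two pieces of this optimization in isolation, here is a small self-contained sketch (using Python 3 raise syntax rather than the recipe's Python 2 form; the variable names are illustrative only):

```python
from collections import deque

def remove_from_deque(d, x):
    # linear scan, O(n) -- no better than list.remove
    for i, v in enumerate(d):
        if v == x:
            del d[i]
            return
    raise ValueError('%r not in %r' % (x, d))

d = deque(['fly', 'moo', 'bar'])
d.append('dave')             # newest entry goes on the right
oldest = d.popleft()         # O(1); list.pop(0) is O(n)
remove_from_deque(d, 'bar')  # d is now deque(['moo', 'dave'])
```

Note that from Python 2.5 onward the deque type has grown its own remove method, so the auxiliary function is only needed on 2.4.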

FIFO is not the ideal policy for deciding which entries a cache should discard; a better one is LRU (Least Recently Used). You can tweak this class's policy into LRU by subclassing and overriding:

class LRUCache(FifoCache):
    def __getitem__(self, key):
        if key in self.dct:
            self.lst.remove(key)
        else:
            raise KeyError, key
        self.lst.append(key)
        return self.dct[key]

This variant does ensure the use of the LRU policy without much extra code. Unfortunately, it makes every read access quite costly (O(n), where n is the number of entries in the cache at that time), due to the need for the self.lst.remove call. Therefore, this recipe's official "Solution" uses the simpler implementation, even though FIFO is notoriously suboptimal as a cache-replacement strategy.
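If the O(n) reads are a concern, one alternative (not part of the original recipe; it assumes collections.OrderedDict, available from Python 2.7 onward) is to let an ordered dictionary track recency, re-inserting a key on each access and evicting from the front:

```python
from collections import OrderedDict

class OrderedLRUCache(object):
    ''' sketch: LRU cache where recency is tracked by the
        insertion order of an OrderedDict '''
    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.dct = OrderedDict()
    def __getitem__(self, key):
        value = self.dct.pop(key)    # raises KeyError if absent
        self.dct[key] = value        # re-insert as most recent
        return value
    def __setitem__(self, key, value):
        if key in self.dct:
            del self.dct[key]        # forget the old position
        self.dct[key] = value
        if len(self.dct) > self.num_entries:
            self.dct.popitem(last=False)   # evict least recently used
    def keys(self):
        return list(self.dct)
```

Each access then costs amortized O(1) dictionary work instead of an O(n) list scan, though a full DictMixin-style mapping interface would still need the remaining methods filled in.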

See Also

Library Reference and Python in a Nutshell docs for module UserDict; Recipe 5.14 also uses UserDict.DictMixin to round up a mapping interface while coding a minimum of boilerplate.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420
