Recipe 18.7. Caching Objects with a FIFO Pruning Strategy

Credit: David M. Wilson, Raymond Hettinger

Problem

You need to build a mapping to be used as a cache, holding up to a fixed number of previous entries and automatically discarding older entries.

Solution

A mapping can implement a relatively small number of core methods and rely on UserDict.DictMixin to provide all the other methods that make up the full official mapping interface. Here is a mapping class for FIFO caching, implemented with this "let DictMixin do it" strategy:

    import UserDict

    class FifoCache(object, UserDict.DictMixin):
        ''' A mapping that remembers the last `num_entries' items that were set '''
        def __init__(self, num_entries, dct=()):
            self.num_entries = num_entries
            self.dct = dict(dct)
            self.lst = []
        def __repr__(self):
            return '%s(%r, %r)' % (
                self.__class__.__name__, self.num_entries, self.dct)
        def copy(self):
            return self.__class__(self.num_entries, self.dct)
        def keys(self):
            return list(self.lst)
        def __getitem__(self, key):
            return self.dct[key]
        def __setitem__(self, key, value):
            dct = self.dct
            lst = self.lst
            if key in dct:
                lst.remove(key)
            dct[key] = value
            lst.append(key)
            if len(lst) > self.num_entries:
                del dct[lst.pop(0)]
        def __delitem__(self, key):
            self.dct.pop(key)
            self.lst.remove(key)
        # a method explicitly defined only as an optimization
        def __contains__(self, item):
            return item in self.dct
        has_key = __contains__

Discussion

Here is a typical example of usage for this FifoCache class:

    if __name__ == '__main__':
        f = FifoCache(num_entries=3)
        f["fly"] = "foo"
        f["moo"] = "two"
        f["bar"] = "baz"
        f["dave"] = "wilson"
        f["age"] = 20
        print f.keys()           # emits ['bar', 'dave', 'age']

For any case where you might use a dictionary object to cache expensive lookups, the FifoCache class shown in this recipe may be a safer alternative for use in long-running applications, whose caches might otherwise consume all system memory if left unchecked.
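As a side note not found in the original recipe: in Python 3, the UserDict.DictMixin class no longer exists, and the closest analog of this "implement a few core methods, inherit the rest" approach is collections.abc.MutableMapping. A sketch of the same FIFO cache in that style might look as follows (the class name FifoCache3 is purely illustrative):

```python
from collections import OrderedDict
from collections.abc import MutableMapping

class FifoCache3(MutableMapping):
    """FIFO-bounded mapping: we supply only the five abstract methods;
    MutableMapping fills in keys, items, values, get, update, and so on."""
    def __init__(self, num_entries, data=()):
        self.num_entries = num_entries
        self._dct = OrderedDict(data)
    def __getitem__(self, key):
        return self._dct[key]
    def __setitem__(self, key, value):
        if key in self._dct:
            del self._dct[key]            # re-setting moves the key to the end
        self._dct[key] = value
        if len(self._dct) > self.num_entries:
            self._dct.popitem(last=False)  # evict the oldest entry
    def __delitem__(self, key):
        del self._dct[key]
    def __iter__(self):
        return iter(self._dct)
    def __len__(self):
        return len(self._dct)
```

An OrderedDict does double duty here as both the storage and the insertion-order record, so no separate self.lst list is needed.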
Thanks to UserDict.DictMixin, class FifoCache exhibits a full dictionary (i.e., mapping) interface: you can substitute an instance of FifoCache wherever you're using a dictionary (as long as you do want entries that were set "a long time ago" to drop out automatically, making space for newer ones).

In Python 2.4, you can get a faster version of FifoCache by setting self.lst to be an instance of collections.deque rather than a list, and using self.lst.popleft() where this recipe's Solution uses self.lst.pop(0). Since the deque type does not have a remove method, you have to implement that with a little auxiliary function:

    def remove_from_deque(d, x):
        for i, v in enumerate(d):
            if v == x:
                del d[i]
                return
        raise ValueError, '%r not in %r' % (x, d)

and use remove_from_deque(self.lst, key) where this recipe's Solution uses self.lst.remove(key). While, as always, you should measure how useful this optimization is in the context of your specific application, it's likely to be helpful when num_entries is high, since self.lst.pop(0) on a list self.lst is O(n), while self.lst.popleft() on a deque self.lst is O(1). (remove_from_deque, like list.remove, is unfortunately and unavoidably O(n).)

FIFO is not the ideal policy for a cache's decisions about what should "fall off"; a better one is LRU (Least Recently Used). You can tweak this class's policy into LRU by subclassing and overriding:

    class LRUCache(FifoCache):
        def __getitem__(self, key):
            if key in self.dct:
                self.lst.remove(key)
            else:
                raise KeyError
            self.lst.append(key)
            return self.dct[key]

This variant does ensure the use of the LRU policy without much extra code. Unfortunately, it makes every read access quite costly (O(n), where n is the number of entries in the cache at that time), due to the need for the self.lst.remove call. Therefore, this recipe's official Solution uses the simpler implementation, even though FIFO is notoriously suboptimal as a cache-replacement strategy.
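As a further aside beyond the original recipe: in current Python, the LRU policy no longer needs the O(n) list remove at all, because collections.OrderedDict offers O(1) move_to_end and popitem operations (and functools.lru_cache covers the common case of memoizing a function's results). A sketch of an LRU-bounded mapping built on that idea (the class name LRUCacheOD is illustrative):

```python
from collections import OrderedDict

class LRUCacheOD:
    """LRU-bounded mapping sketch: OrderedDict.move_to_end makes both
    reads and writes O(1), avoiding the O(n) self.lst.remove call."""
    def __init__(self, num_entries):
        self.num_entries = num_entries
        self._dct = OrderedDict()
    def __getitem__(self, key):
        value = self._dct[key]            # raises KeyError if absent
        self._dct.move_to_end(key)        # mark as most recently used
        return value
    def __setitem__(self, key, value):
        if key in self._dct:
            self._dct.move_to_end(key)
        self._dct[key] = value
        if len(self._dct) > self.num_entries:
            self._dct.popitem(last=False)  # evict the least recently used
```

Reading an entry moves it to the end of the OrderedDict, so the entry at the front is always the least recently used one and can be evicted in constant time.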
See Also

Library Reference and Python in a Nutshell docs for module UserDict; Recipe 5.14 also uses UserDict.DictMixin to round out a mapping interface while coding a minimum of boilerplate.