Credit: Michael Granger with Ben Bleything
You want to transparently cache the results of expensive operations, so that code that triggers the operations doesn't need to know how to use the cache. The memcached program, described in Recipe 16.16, lets you use other machines' RAM to store key-value pairs. The question is how to hide the use of this cache from the rest of your code.
If you have the luxury of designing your own implementation of the expensive operation, you can design in transparent caching from the beginning. The following code defines a get method that delegates to expensive_get only if it can't find an appropriate value in the cache. In this case, the expensive operation that gets cached is the (relatively inexpensive, actually) string reversal operation:
    require 'rubygems'
    require 'memcache'

    class DataLayer
      def initialize(*cache_servers)
        @cache = MemCache.new(*cache_servers)
      end

      # Return the cached value, falling back to the expensive fetch on a miss.
      def get(key)
        @cache[key] ||= expensive_get(key)
      end
      alias_method :[], :get

      protected

      def expensive_get(key)
        # …do expensive fetch of data for key
        puts "Fetching expensive value for #{key}"
        key.to_s.reverse
      end
    end
Assuming you've got a memcached server running on your local machine, you can use this DataLayer as a way to cache the reversed versions of strings:
    layer = DataLayer.new('localhost:11211')

    3.times do
      puts "Data for foo: #{layer['foo']}"
    end
    # Fetching expensive value for foo
    # Data for foo: oof
    # Data for foo: oof
    # Data for foo: oof
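If you don't have a memcached server handy, the same read-through idiom can be tried with a plain Hash standing in for the MemCache client. This is a minimal sketch, not the recipe's code: HashBackedLayer and fetch_count are hypothetical names added for illustration, and the Hash stand-in is an assumption that works only because `||=` needs nothing more than `[]` and `[]=`.

```ruby
# A plain Hash stands in for MemCache here (an assumption for illustration);
# the ||= idiom only needs an object that responds to [] and []=.
class HashBackedLayer
  attr_reader :fetch_count

  def initialize(cache = {})
    @cache = cache
    @fetch_count = 0
  end

  # Read-through lookup: expensive_get runs only on a cache miss.
  def get(key)
    @cache[key] ||= expensive_get(key)
  end
  alias_method :[], :get

  protected

  def expensive_get(key)
    @fetch_count += 1          # count how often the slow path runs
    key.to_s.reverse
  end
end

layer = HashBackedLayer.new
3.times { layer['foo'] }
puts layer.fetch_count         # the expensive fetch ran only once
```

One caveat of `||=` worth keeping in mind: a stored value of nil or false looks like a miss, so an expensive operation that legitimately returns either will be repeated on every call.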
That's the easy case. But you don't always get the opportunity to define a data layer from scratch. If you want to add memcaching to an existing data layer, you can create a caching strategy and add it to your existing classes as a mixin.
Here's a data layer, already written, that has no caching:
    class MyDataLayer
      def get(key)
        puts "Getting value for #{key} from data layer"
        return key.to_s.reverse
      end
    end
The data layer doesn't know about the cache, so all of its operations are expensive. In this instance, it's reversing a string every time you ask for it:
    layer = MyDataLayer.new

    "Value for foo: #{layer.get('foo')}"
    # Getting value for foo from data layer
    # => "Value for foo: oof"

    "Value for foo: #{layer.get('foo')}"
    # Getting value for foo from data layer
    # => "Value for foo: oof"

    "Value for foo: #{layer.get('foo')}"
    # Getting value for foo from data layer
    # => "Value for foo: oof"
Let's improve performance a little by defining a caching mixin. It'll wrap the get method so that it only runs the expensive code (the string reversal) if the answer isn't already in the cache:
    require 'memcache'

    module GetSetMemcaching
      SERVER = 'localhost:11211'

      def self::extended(mod)
        mod.module_eval do
          # Keep the original method reachable under a new name, then
          # replace it with a caching wrapper.
          alias_method :__uncached_get, :get
          remove_method :get

          def get(key)
            puts "Cached get of #{key.inspect}"
            get_cache()[key] ||= __uncached_get(key)
          end

          def get_cache
            puts "Fetching cache object for #{SERVER}"
            @cache ||= MemCache.new(SERVER)
          end
        end
        super
      end

      def self::included(mod)
        mod.extend(self)
        super
      end
    end
Once we mix GetSetMemcaching into our data layer, the same code we ran before will magically start to use the cache:
    # Mix in caching to the pre-existing class
    MyDataLayer.extend(GetSetMemcaching)

    "Value for foo: #{layer.get('foo')}"
    # Cached get of "foo"
    # Fetching cache object for localhost:11211
    # Getting value for foo from data layer
    # => "Value for foo: oof"

    "Value for foo: #{layer.get('foo')}"
    # Cached get of "foo"
    # Fetching cache object for localhost:11211
    # => "Value for foo: oof"

    "Value for foo: #{layer.get('foo')}"
    # Cached get of "foo"
    # Fetching cache object for localhost:11211
    # => "Value for foo: oof"
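The alias_method approach above is the recipe's technique. In modern Ruby versions, a different way to wrap an existing method is Module#prepend, which places the mixin ahead of the class in method lookup so that super reaches the original method. The following is a sketch of that alternative, not the recipe's code: PrependedCaching and SlowLayer are hypothetical names, and a Hash again stands in for the MemCache client.

```ruby
# Hypothetical names throughout: PrependedCaching and SlowLayer are not
# from the recipe, and a Hash stands in for the MemCache client.
module PrependedCaching
  def get(key)
    cache[key] ||= super       # super calls the original, uncached get
  end

  def cache
    @cache ||= {}
  end
end

class SlowLayer
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def get(key)
    @calls += 1                # count how often the slow path runs
    key.to_s.reverse
  end
end

# Add caching to the pre-existing class without renaming any methods.
SlowLayer.prepend(PrependedCaching)

layer = SlowLayer.new
3.times { layer.get('foo') }
puts layer.get('foo')          # served from the cache
puts layer.calls               # the underlying get ran only once
```

With prepend there is no need to alias or remove the original get, which also means the original class definition is left untouched.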
The examples above are missing a couple of features you'd see in real life. Their API is very simple (just get methods), and they have no cache invalidation: items will stay in the cache forever, even if the underlying data changes.
The same basic principles apply to more complex caches, though. When you need a value that's expensive to find or calculate, you first ask the cache for the value, keyed by its identifying feature. The cache might map a SQL query to its result set, a primary key to the corresponding database object, an array of compound keys to the corresponding database object, and so on. If the object is missing from the cache, you fetch it the expensive way, and put it in the cache.
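The lookup pattern just described, along with the invalidation the earlier examples lack, might be sketched as follows. This is an illustration under stated assumptions rather than a real client: ExpiringCache, its fetch and invalidate methods, and the injectable clock are all hypothetical, and the storage is an in-process Hash, not memcached.

```ruby
# Hypothetical ExpiringCache: a Hash-backed stand-in, not the MemCache API.
class ExpiringCache
  Entry = Struct.new(:value, :expires_at)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(ttl_seconds, clock = -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Ask the cache first; on a miss or an expired entry, run the block
  # (the expensive fetch) and store its result.
  def fetch(key)
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > @clock.call
    value = yield
    @entries[key] = Entry.new(value, @clock.call + @ttl)
    value
  end

  # Drop a key when the underlying data changes.
  def invalidate(key)
    @entries.delete(key)
  end
end

cache = ExpiringCache.new(60)
fetches = 0
lookup = lambda { cache.fetch('foo') { fetches += 1; 'foo'.reverse } }

lookup.call                    # miss: runs the expensive block
lookup.call                    # hit: served from the cache
cache.invalidate('foo')
lookup.call                    # miss again after invalidation
puts fetches                   # the block ran twice in total
```

The key passed to fetch can be anything that identifies the value, such as a SQL string or a primary key. Writing code that changes the underlying data is then responsible for calling invalidate, while the time-to-live bounds how long a stale entry can survive if an invalidation is missed.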