Using an Array or Other Modifiable Object as a Hash Key

Problem

You want to use a modifiable built-in object (an array or a hash, but not a string) as a key in a hash, even while you modify the object in place. A naive solution tends to lose hash values once the keys are modified:

	coordinates = [10, 5]
	treasure_map = { coordinates => 'jewels' }
	treasure_map[coordinates] # => "jewels"

	# Add a z-coordinate to indicate how deep the treasure is buried.
	coordinates << -5

	coordinates # => [10, 5, -5]
	treasure_map[coordinates] # => nil
	 # Oh no!

 

Solution

The easiest solution is to call the Hash#rehash method every time you modify one of the hash's keys. Hash#rehash will repair the broken treasure map defined above:

	treasure_map.rehash
	treasure_map[coordinates] # => "jewels"

If this is too much code, you might consider changing the definition of the object you use as a hash key, so that modifications don't affect the way the hash treats it.

Suppose you want a reliably hashable Array class. If you want this behavior universally, you can reopen the Array class and redefine hash to give you the new behavior. But it's safer to define a subclass of Array that implements a reliable-hashing mixin, and to use that subclass only for the Arrays you want to use as hash keys:

	module ReliablyHashable
	 def hash
	 return object_id
	 end
	end

	class ReliablyHashableArray < Array
	 include ReliablyHashable
	end

It's now possible to keep track of the jewels:

	coordinates = ReliablyHashableArray.new([10,5])
	treasure_map = { coordinates => 'jewels' }
	treasure_map[coordinates] # => "jewels"

	# Add a z-coordinate to indicate how deep the treasure is buried.
	coordinates.push(-5)

	treasure_map[coordinates] # => "jewels"

 

Discussion

Ruby performs hash lookups using not the key object itself but the object's hash code (an integer obtained from the key by calling its hash method). The default implementation of hash, in Object, uses an object's internal ID as its hash code. Array, Hash, and String override this method to provide different behavior.

In the initial example, the hash code of [10,5] is 41 and the hash code of [10,5,5] is83. The mapping of the coordinate list to 'jewels' is still present (it'll still show up in an iteration over each_pair), but once you change the coordinate list, you can no longer use that variable as a key.

You may also run into this problem when you use a hash or a string as a hash key, and then modify the key in place. This happens because the hash implementations of many built-in classes try to make sure that two objects that are "the same" (for instance, two distinct arrays with the same contents, or two distinct but identical strings) get the same hash value. When coordinates is [10,5], it has a hash code of 41, like any other Array containing [10,5]. When coordinates is [10,5,5] it has a hash code of83, like any other Array with those contents.

Because of the potential for confusion, some languages don't let you use arrays or hashes as hash keys at all. Ruby lets you do it, but you have to face the consequences if the key changes. Fortunately, you can dodge the consequences by overriding hash to work the way you want.

Since an object's internal ID never changes, the Object implementation is what you want to get reliable hashing. To get it back, you'll have to override or subclass the hash method of Array or Hash (depending on what type of key you're having trouble with).

The implementations of hash given in the solution violate the principle that different representations of the same data should have the same hash code. This means that two ReliablyHashableArray objects will have different hash codes even if they have the same contents. For instance:

	a = [1,2]
	b = a.clone
	a.hash # => 11
	b.hash # => 11
	a = 
ReliablyHashableArray.new([1,2])
	b = a.clone
	a.hash # => -606031406
	b.hash # => -606034266

If you want a particular value in a hash to be accessible by two different arrays with the same contents, then you must key it to a regular array instead of a ReliablyHashableArray. You can't have it both ways. If an object is to have the same hash key as its earlier self, it can't also have the same hash key as another representation of its current state.

Another solution is to freeze your hash keys. Any frozen object can be reliably used as a hash key, since you can't do anything to a frozen object that would cause its hash code to change. Ruby uses this solution: when you use a string as a hash key, Ruby copies the string, freezes the copy, and uses that as the actual hash key.

See Also

  • Recipe 8.15, "Freezing an Object to Prevent Changes"


Strings

Numbers

Date and Time

Arrays

Hashes

Files and Directories

Code Blocks and Iteration

Objects and Classes8

Modules and Namespaces

Reflection and Metaprogramming

XML and HTML

Graphics and Other File Formats

Databases and Persistence

Internet Services

Web Development Ruby on Rails

Web Services and Distributed Programming

Testing, Debugging, Optimizing, and Documenting

Packaging and Distributing Software

Automating Tasks with Rake

Multitasking and Multithreading

User Interface

Extending Ruby with Other Languages

System Administration



Ruby Cookbook
Ruby Cookbook (Cookbooks (OReilly))
ISBN: 0596523696
EAN: 2147483647
Year: N/A
Pages: 399

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net