Computing Set Operations on Arrays

Problem

You want to find the union, intersection, difference, or Cartesian product of two arrays, or the complement of a single array with respect to some universe.

Solution

Array objects have overloaded arithmetic and logical operators to provide the three simplest set operations:

	#Union
	[1,2,3] | [1,4,5] # => [1, 2, 3, 4, 5] 

	#Intersection
	[1,2,3] & [1,4,5] # => [1]

	#Difference
	[1,2,3] - [1,4,5] # => [2, 3]

Set objects overload the same operators, as well as the exclusive-or operator (^).If you already have Arrays, though, it's more efficient to deconstruct the XOR operation into its three component operations.

	require 'set' 
	a = [1,2,3]
	b = [3,4,5]
	a.to_set ^ b.to_set # => # 
	(a | b) - (a & b) # => [1, 2, 4, 5]

 

Discussion

Set objects are intended to model mathematical sets: where arrays are ordered and can contain duplicate entries, Sets model an unordered collection of unique items. Set not only overrides operators for set operations, it provides English-language aliases for the three most common operators: Set#union, Set#intersection, and Set#difference. An array can only perform a set operation on another array, but a Set can perform a set operation on any Enumerable.

	array = [1,2,3]
	set = [3,4,5].to_s 
	array & set # => TypeError: can't convert Set into Array
	set & array # => #

You might think that Set objects would be optimized for set operations, but they're actually optimized for constant-time membership checks (internally, a Set is based on a hash). Set union is faster when the left-hand object is a Set object, but intersection and difference are significantly faster when both objects are arrays. It's not worth it to convert arrays into Sets just so you can say you performed set operations on Set objects.

The union and intersection set operations remove duplicate entries from arrays. The difference operation does not remove duplicate entries from an array except as part of a subtraction.

	[3,3] & [3,3] # => [3]
	[3,3] | [3,3] # => [3]
	[1,2,3,3] - [1] # => [2, 3, 3]
	[1,2,3,3] - [3] # => [1, 2]
	[1,2,3,3] - [2,2,3] # => [1]

 

Complement

If you want the complement of an array with respect to some small universe, create that universe and use the difference operation:

	u = [:red, :orange, :yellow, :green, :blue, :indigo, :violet]
	a = [:red, :blue]
	u - a # => [:orange, :yellow, :green, :indigo, :violet]

More often, the relevant universe is infinite (the set of natural numbers)or extremely large (the set of three-letter strings). The best strategy here is to define a generator and use it to iterate through the complement. Be sure to break when you're done; you don't want to iterate over an infinite set.

	def natural_numbers_except(exclude) 
	 exclude_map = {}
	 exclude.each { |x| exclude_map[x] = true }
	 x = 1
	 while true
	 yield x unless exclude_map[x] 
	 x = x.succ
	 end
	end

	natural_numbers_except([2,3,6,7]) do |x| 
	 break if x > 10 
	 puts x 
	end
	# 1
	# 4
	# 5
	# 8
	# 9
	# 10

 

Cartesian product

To get the Cartesian product of two arrays, write a nested iteration over both lists and append each pair of items to a new array. This code is attached to Enumerable so you can also use it with Sets or any other Enumerable.

	module Enumberable
	 def cartesian(other)
	 res = []
	 each { |x| other.each { |y| res << [x, y] } } 
	 return res
	 end
	end

	[1,2,3].cartesian(["a",5,6])
	# => [[1, "a"], [1, 5], [1, 6],
	# [2, "a"], [2, 5], [2, 6],
	# [3, "a"], [3, 5], [3, 6] 

This version uses Enumerable#inject to make the code more concise; however, the original version is more efficient.

	module Enumerable 
	 def cartesian(other)
	 inject([]) { |res, x| other.inject(res) { |res, y| res << [x,y] } }
	 end
	end

 

See Also

  • See Recipe 2.5, "Generating Random Numbers," for an example (constructing a deck of cards from suits and ranks)that could benefit from a function to calculate the Cartesian product
  • Recipe 2.10, "Multiplying Matrices"


Strings

Numbers

Date and Time

Arrays

Hashes

Files and Directories

Code Blocks and Iteration

Objects and Classes8

Modules and Namespaces

Reflection and Metaprogramming

XML and HTML

Graphics and Other File Formats

Databases and Persistence

Internet Services

Web Development Ruby on Rails

Web Services and Distributed Programming

Testing, Debugging, Optimizing, and Documenting

Packaging and Distributing Software

Automating Tasks with Rake

Multitasking and Multithreading

User Interface

Extending Ruby with Other Languages

System Administration



Ruby Cookbook
Ruby Cookbook (Cookbooks (OReilly))
ISBN: 0596523696
EAN: 2147483647
Year: N/A
Pages: 399

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net