Multiplying Matrices

Problem

You want to turn arrays of arrays of numbers into mathematical matrices, and multiply the matrices together.

Solution

You can create Matrix objects from arrays of arrays, and multiply them together with the * operator:

	require ' 
matrix'
	require 'mathn'

	a1 = [[1, 1, 0, 1],
	 [2, 0, 1, 2],
	 [3, 1, 1, 2]]
	m1 = 
Matrix[*a1]
	# => 
Matrix[[1, 1, 0, 1], [2, 0, 1, 2], [3, 1, 1, 2]]

	a2 = [[1, 0],
	 [3, 1], 
	 [1, 0], 
	 [2, 2.5]] 
	m2 = Matrix[*a2]
	# => Matrix[[1, 0], [3, 1], [1, 0], [2, 2.5]]

	m1 * m2
	# => Matrix[[6, 3.5], [7, 5.0], [11, 6.0]]

Note the unusual syntax for creating a Matrix object: you pass the rows of the matrix into the array indexing operator, not into Matrix#new (which is private).

Discussion

Ruby's Matrix class overloads the arithmetic operators to support all the basic matrix arithmetic operations, including multiplication, between matrices of compatible dimension. If you perform an arithmetic operation on incompatible matrices, you'll get an ExceptionForMatrix::ErrDimensionMismatch.

Multiplying one matrix by another is simple enough, but multiplying a chain of matrices together can be faster or slower depending on the order in which you do the multiplications. This follows from the fact that multiplying a matrix with dimensions K x M, by a matrix with dimensions MxN, requires K * M * N operations and gives a matrix with dimension K * N. If K is large for some matrix, you can save time by waiting til the end before doing multiplications involving that matrix.

Consider three matrices A, B, and C, which you want to multiply together. A has 100 rows and 20 columns. B has 20 rows and 10 columns. C has 10 rows and one column.

Since matrix multiplication is associative, you'll get the same results whether you multiply A by B and then the result by C, or multiply B by C and then the result by A. But multiplying A by B requires 20,000 operations (100 * 20 * 10), and multiplying (AB) by C requires another 1,000 (100 * 10 * 1). Multiplying B by C only requires 200 operations (20 * 10 * 1), and multiplying the result by A requires 2,000 more (100 * 20 * 1). It's almost 10 times faster to multiply A(BC) instead of the naive order of (AB)C.

That kind of potential savings justifies doing some up-front work to find the best order for the multiplication. Here is a method that recursively figures out the most efficient multiplication order for a list of Matrix objects, and another method that actually carries out the multiplications. They share an array containing information about where to divide up the list of matrices: where to place the parentheses, if you will.

	class Matrix
	 def self.multiply(*matrices) 
	 cache = []
	 matrices.size.times { cache << [nil] * matrices.size }
	 best_split(cache, 0, matrices.size-1, *matrices)
	 multiply_following_cache(cache, 0, matrices.size-1, *matrices)
	 end

Because the methods that do the actual work pass around recursion arguments that the end user doesn't care about, I've created Matrix.multiply, a wrapper method for the methods that do the real work. These methods are defined below (Matrix.best_split and Matrix.multiply_following_cache). Matrix.multiply_following_cache assumes that the optimal way to multiply that list of Matrix objects has already been found and encoded in a variable cache. It recursively performs the matrix multiplications in the optimal order, as determined by the cache.

	:private
	def self.multiply_following_cache(cache, chunk_start, chunk_end, *matrices)
	 if chunk_end == chunk_start
	 # There's only one matrix in the list; no need to multiply.
	 return matrices[chunk_start]
	 elsif chunk_end-chunk_start == 1
	 # There are only two matrices in the list; just multiply them together.
	 lhs, rhs = matrices[chunk_start..chunk_end]
	 else
	 # There are more than two matrices in the list. Look in the
	 # cache to see where the optimal split is located. Multiply
	 # together all matrices to the left of the split (recursively,
	 # in the optimal order) to get our equation's left-hand
	 # side. Similarly for all matrices to the right of the split, to
	 # get our right-hand side.
	 split_after = cache[chunk_start][chunk_end][1]
	 lhs = multiply_following_cache(cache, chunk_start, split_after, * 
matrices)
	 rhs = multiply_following_cache(cache, split_after+1, chunk_end, * 
matrices)
	 end

	 # Begin debug code: this illustrates the order of multiplication,
	 # showing the matrices in terms of their dimensions rather than their
	 # (possibly enormous) contents.
	 if $DEBUG 
	 lhs_dim = "#{lhs.row_size}x#{lhs.column_size}"
	 rhs_dim = "#{rhs.row_size}x#{rhs.column_size}"
	 cost = lhs.row_size * lhs.column_size * rhs.column_size
	 puts "Multiplying #{lhs_dim} by #{rhs_dim}: cost #{cost}"
	 end

	 # Do a matrix multiplication of the two matrices, whether they are
	 # the only two matrices in the list or whether they were obtained
	 # through two recursive calls.
	 return lhs * rhs
	end

Finally, here's the method that actually figures out the best way of splitting up the multiplcations. It builds the cache used by the multiply_following_cache method defined above. It also uses the cache as it builds it, so that it doesn't solve the same subproblems over and over again.

	 def self.best_split(cache, chunk_start, chunk_end, *matrices)
	 if chunk_end == chunk_start
	 cache[chunk_start][chunk_end] = [0, nil]
	 end
	 return cache[chunk_start][chunk_end] if cache[chunk_start][chunk_end]

	 #Try splitting the chunk at each possible location and find the
	 #minimum cost of doing the split there. Then pick the smallest of
	 #the minimum costs: that's where the split should actually happen.
	 minimum_costs = [] 
	 chunk_start.upto(chunk_end-1) do |split_after|
	 lhs_cost = best_split(cache, chunk_start, split_after, *matrices)[0]
	 rhs_cost = best_split(cache, split_after+1, chunk_end, *matrices)[0]

	 lhs_rows = matrices[chunk_start].row_size
	 rhs_rows = matrices[split_after+1].row_size
	 rhs_cols = matrices[chunk_end].column_size
	 merge_cost = lhs_rows * rhs_rows * rhs_cols
	 cost = lhs_cost + rhs_cost + merge_cost
	 minimum_costs << cost
	 end
	 minimum = minimum_costs.min
	 minimum_index = chunk_start + minimum_costs.index(minimum)
	 return cache[chunk_start][chunk_end] = [minimum, minimum_index]
	 end
	end

A simple test confirms the example set of matrices spelled out earlier. Remember that we had a 100 x 20 matrix (A), a 20 x 10 matrix (B), and a 20 x 1 matrix (C). Our method should be able to figure out that it's faster to multiply A(BC) than the naive multiplication (AB)C. Since we don't care about the contents of the matrices, just the dimensions, we'll first define some helper methods that make it easy to generate matrices with specific dimensions but random contents.

	class Matrix 
	 # Creates a randomly populated matrix with the given dimensions.
	 def self.with_dimensions(rows, cols)
	 a = []
	 rows.times { a << []; cols.times { a[-1] << rand(10) } }
	 return Matrix[*a]
	 end

	 # Creates an array of 
matrices that can be multiplied together
	 def self.multipliable_chain(*rows) 
	 matrices = []
	 0.upto(rows.size-2) do |i| 
	 matrices << Matrix.with_dimensions(rows[i], rows[i+1])
	 end
	 return matrices 
	 end
	end

After all that, the test is kind of anticlimactic:

	# Create an array of matrices 100x20, 20x10, 10x1.
	chain = Matrix.multipliable_chain(100, 20, 10, 1)

	# Multiply those matrices two different ways, giving the same result.
	Matrix.multiply(*chain) == (chain[0] * chain[1] * chain[2])
	# Multiplying 20x10 by 10x1: cost 200
	# Multiplying 100x20 by 20x1: cost 2000
	# => true

We can use the Benchmark library to verify that matrix multiplication goes much faster when we do the multiplications in the right order:

	# We'll generate the dimensions and contents of the matrices randomly,
	# so no one can accuse us of cheating. 
	dimensions = [] 
	10.times { dimensions << rand(90)+10 }
	chain = Matrix.multipliable_chain(*dimensions)

	require 'benchmark'
	result_1 = nil
	result_2 = nil
	Benchmark.bm(11) do |b|
	 b.report("Unoptimized") do
	 result_1 = chain[0]
	 chain[1..chain.size].each { |c| result_1 *= c }
	 end
	 b.report("Optimized") { result_2 = Matrix.multiply(*chain) }
	end
	# user system total real
	# Unoptimized 4.350000 0.400000 4.750000 ( 11.104857)
	# Optimized 1.410000 0.110000 1.520000 ( 3.559470)

	# Both multiplications give the same result.
	result_1 == result_2 # => true