8.1. Working with Arrays

Arrays in Ruby are indexed by integers and are zero-based, just like C arrays. There the resemblance ends, however.

A Ruby array is dynamic. It is possible (but not necessary) to specify its size when you create it. After creation, it can grow as needed without any intervention by the programmer.

A Ruby array is heterogeneous in the sense that it can store multiple data types rather than just one type. Actually, it stores object references rather than the objects themselves, except in the case of immediate values such as Fixnums.

An array keeps up with its own length so that we don't have to waste our time calculating it or keeping an external variable in sync with the array. Iterators also are defined so that, in practice, we rarely need to know the array length anyway.

Finally, the Array class in Ruby provides arrays with many useful functions for accessing, searching, concatenating, and otherwise manipulating arrays. We'll spend the remainder of this section exploring the built-in functionality and expanding on it.

8.1.1. Creating and Initializing an Array

The special class method [] is used to create an array; the data items listed in the brackets are used to populate the array. The three ways of calling this method are shown in the following lines. (Arrays a, b, and c will all be populated identically.)

a = Array.[](1,2,3,4)
b = Array[1,2,3,4]
c = [1,2,3,4]

There is also a class method called new that can take 0, 1, or 2 parameters. The first parameter is the initial size of the array (number of elements). The second parameter is the initial value of each of the elements:

d = Array.new               # Create an empty array
e = Array.new(3)            # [nil, nil, nil]
f = Array.new(3, "blah")    # ["blah", "blah", "blah"]

Look carefully at the last line of this preceding example. A common "beginner's error" is to think that the objects in the array are distinct. Actually, they are three references to the same object. Therefore if you change that object (as opposed to replacing it with another object), you will change all elements of the array. To avoid this behavior, use a block. Then that block is evaluated once for each element, and all elements are different objects:

f[0].capitalize!            # f is now: ["Blah", "Blah", "Blah"]
g = Array.new(3) { "blah" } # ["blah", "blah", "blah"]
g[0].capitalize!            # g is now: ["Blah", "blah", "blah"]
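One way to see the difference is to count distinct object identities. The following is only a sketch: distinct_object_count is a hypothetical helper (not part of Ruby), and the strings are created with dup so that no string literal is mutated.

```ruby
# Hypothetical helper: how many distinct objects does the array refer to?
def distinct_object_count(array)
  array.map { |obj| obj.object_id }.uniq.size
end

shared   = Array.new(3, "blah".dup)      # one String, referenced three times
separate = Array.new(3) { "blah".dup }   # the block runs once per element

puts distinct_object_count(shared)       # 1
puts distinct_object_count(separate)     # 3

shared[0].capitalize!                    # visible through every slot
separate[0].capitalize!                  # affects only the first slot
p shared                                 # ["Blah", "Blah", "Blah"]
p separate                               # ["Blah", "blah", "blah"]
```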

8.1.2. Accessing and Assigning Array Elements

Element reference and assignment are done using the instance methods [] and []=, respectively. Each can take an integer parameter, a pair of integers (start and length), or a range. A negative index counts backward from the end of the array, starting at -1.

The special instance method at works like the simple case of element reference. Because it can take only a single integer parameter, it is slightly faster.

a = [1, 2, 3, 4, 5, 6]
b = a[0]                # 1
c = a.at(0)             # 1
d = a[-2]               # 5
e = a.at(-2)            # 5
f = a[9]                # nil
g = a.at(9)             # nil
h = a[3,3]              # [4, 5, 6]
i = a[2..4]             # [3, 4, 5]
j = a[2...4]            # [3, 4]
a[1] = 8                # [1, 8, 3, 4, 5, 6]
a[1,3] = [10, 20, 30]   # [1, 10, 20, 30, 5, 6]
a[0..3] = [2, 4, 6, 8]  # [2, 4, 6, 8, 5, 6]
a[-1] = 12              # [2, 4, 6, 8, 5, 12]

Note in the following example how an assignment beyond the end of the array causes the array to grow (the intervening elements are filled with nil). Note also that a subarray can be replaced with more elements than were originally there, also causing the array to grow.

k = [2, 4, 6, 8, 10]
k[1..2] = [3, 3, 3]     # [2, 3, 3, 3, 8, 10]
k[7] = 99               # [2, 3, 3, 3, 8, 10, nil, 99]

Finally, we should mention that an array assigned to a single element actually inserts that element as a nested array (unlike an assignment to a range):

m = [1, 3, 5, 7, 9]
m[2] = [20, 30]         # [1, 3, [20, 30], 7, 9]
# On the other hand...
m = [1, 3, 5, 7, 9]
m[2..2] = [20, 30]      # [1, 3, 20, 30, 7, 9]

The method slice is simply an alias for the [] method:

x = [0, 2, 4, 6, 8, 10, 12]
a = x.slice(2)               # 4
b = x.slice(2,4)             # [4, 6, 8, 10]
c = x.slice(2..4)            # [4, 6, 8]

The special methods first and last return the first and last elements of an array, respectively. They return nil if the array is empty:

x = %w[alpha beta gamma delta epsilon]
a = x.first      # "alpha"
b = x.last       # "epsilon"

We have seen that some of the element-referencing techniques actually can return an entire subarray. There are other ways to access multiple elements, which we'll look at now.

The method values_at takes a list of indices (or indexes, if you prefer) and returns an array consisting of only those elements. It can be used where a range cannot (when the elements are not all contiguous).

In previous versions of Ruby, values_at was called indices, with an alias called indexes. These are no longer to be used.

x = [10, 20, 30, 40, 50, 60]
y = x.values_at(0, 1, 4)        # [10, 20, 50]
z = x.values_at(0..2,5)         # [10, 20, 30, 60]

8.1.3. Finding an Array's Size

The method length or its alias size gives the number of elements in an array. (As always, this is one greater than the index of the last item.)

x = ["a", "b", "c", "d"]
a = x.length               # 4
b = x.size                 # 4

The method nitems (available in Ruby 1.8 only; it was removed in Ruby 1.9) is the same except that it does not count nil elements:

y = [1, 2, nil, nil, 3, 4]
c = y.size                 # 6
d = y.length               # 6
e = y.nitems               # 4
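On Ruby versions that lack nitems, the same count can be obtained from methods that are still present. This is a minimal sketch; non_nil_count is a hypothetical name chosen for illustration.

```ruby
# Hypothetical stand-in for the old nitems method.
def non_nil_count(array)
  array.count { |item| !item.nil? }
end

y = [1, 2, nil, nil, 3, 4]
puts y.size               # 6
puts non_nil_count(y)     # 4
puts y.compact.size       # 4 (builds a temporary array without the nils)
```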

8.1.4. Comparing Arrays

Comparing arrays is tricky. If you do it at all, do it with caution.

The instance method <=> is used to compare arrays. It works the same as in the other contexts in which it is used, returning -1 (meaning "less than"), 0 (meaning "equal"), or 1 (meaning "greater than"); it returns nil when the arrays cannot be ordered. The methods == and != do not rely on it; they compare corresponding elements with ==.

Arrays are compared in an "elementwise" manner; the first two elements that are not equal determine the inequality for the whole comparison. (Thus preference is given to the leftmost elements, just as if we were comparing two long integers "by eye," looking at one digit at a time.)

a = [1, 2, 3, 9, 9]
b = [1, 2, 4, 1, 1]
c = a <=> b            # -1 (meaning a < b)

If all elements are equal, the arrays are equal. If one array is longer than another, and they are equal up to the length of the shorter array, the longer array is considered to be greater.

d = [1, 2, 3]
e = [1, 2, 3, 4]
f = [1, 2, 3]
if (d <=> e) == -1
  puts "d is less than e"   # Prints "d is less than e"
end
if d == f
  puts "d equals f"         # Prints "d equals f"
end

Because the Array class does not mix in the Comparable module, the usual operators <, >, <=, and >= are not defined for arrays. But we can easily define them ourselves if we choose:

class Array
  def <(other)
    (self <=> other) == -1
  end
  def <=(other)
    (self < other) or (self == other)
  end
  def >(other)
    (self <=> other) == 1
  end
  def >=(other)
    (self > other) or (self == other)
  end
end

However, it would be easier simply to include Comparable ourselves:

class Array
  include Comparable
end

Having defined these new operators, we can use them as you would expect:

if a < b
  print "a < b"       # Prints "a < b"
else
  print "a >= b"
end
if d < e
  puts "d < e"        # Prints "d < e"
end

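It is conceivable that comparing arrays will result in the comparison of two elements for which <=> is undefined or meaningless. With Comparable included, the following code results in a runtime error (an ArgumentError) because the comparison 3 <=> "x" returns nil and the arrays cannot be ordered. (The hand-written operators shown earlier would simply return false instead.)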

g = [1, 2, 3]
h = [1, 2, "x"]
if g < h             # Error!
  puts "g < h"       # No output
end

However, in case you are still not confused, equal and not-equal will still work in this case. This is because two objects of different types are naturally considered to be unequal even though we can't say which is greater or less than the other.

if g != h             # No problem here.
  puts "g != h"       # Prints "g != h"
end

Finally, it is conceivable that two arrays containing mismatched data types will still compare with < and > operators. In the case shown here, we get a result before we stumble across the incomparable elements:

i = [1, 2, 3]
j = [1, 2, 3, "x"]
if i < j             # No problem here.
  puts "i < j"       # Prints "i < j"
end
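If reopening Array is undesirable, the same cases can be explored with <=> alone. The sketch below assumes a hypothetical helper named array_less? that raises when <=> cannot decide; it is an illustration, not an established idiom.

```ruby
# Hypothetical helper: true if a sorts before b; raises if <=> returns nil.
def array_less?(a, b)
  result = (a <=> b)
  raise ArgumentError, "arrays are not comparable" if result.nil?
  result < 0
end

p([1, 2, 3] <=> [1, 2, "x"])          # nil (3 and "x" cannot be ordered)
p([1, 2, 3] <=> [1, 2, 3, "x"])       # -1 (decided by length; "x" is never compared)
p(array_less?([1, 2, 3], [1, 2, 4]))  # true
```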

8.1.5. Sorting an Array

The easiest way to sort an array is to use the built-in sort method as follows:

words = %w(the quick brown fox)
list = words.sort  # ["brown", "fox", "quick", "the"]
# Or sort in place:
words.sort!        # ["brown", "fox", "quick", "the"]

This method assumes that all the elements in the array are comparable with each other. A mixed array such as [1, 2, "three", 4] normally raises an error (an ArgumentError).

In a case like this one, you can use the block form of the same method call. The following example assumes that there is at least a to_s method for each element (to convert it to a string):

a = [1, 2, "three", "four", 5, 6]
b = a.sort {|x,y| x.to_s <=> y.to_s}
# b is now [1, 2, 5, 6, "four", "three"]

Of course, such an ordering (in this case, depending on ASCII) may not be meaningful. If you have such a heterogeneous array, you may want to ask yourself why you are sorting it in the first place or why you are storing different types of objects.

This technique works because the block returns an integer (-1, 0, or 1) on each invocation. When -1 is returned, x is placed before y; when 1 is returned, x is placed after y. Thus, to sort in descending order, we could simply swap the order of the comparison:

x = [1, 4, 3, 5, 2]
y = x.sort {|a,b| b <=> a}    # [5, 4, 3, 2, 1]

The block style can also be used for more complex sorting. Let's suppose that we want to sort a list of book and movie titles in a certain way: We ignore case, we ignore spaces entirely, and we want to ignore certain kinds of embedded punctuation. Here we present a simple example. (Both English teachers and computer programmers will be equally confused by this kind of alphabetizing.)

titles = ["Starship Troopers",
          "A Star is Born",
          "Star Wars",
          "Star 69",
          "The Starr Report"]
sorted = titles.sort do |x,y|
  # Delete leading articles
  a = x.sub(/^(a |an |the )/i, "")
  b = y.sub(/^(a |an |the )/i, "")
  # Delete spaces and punctuation (the hyphen comes last so that
  # it is taken literally rather than as a character range)
  a.delete!(" .,?!-")
  b.delete!(" .,?!-")
  # Convert to uppercase
  a.upcase!
  b.upcase!
  # Compare a and b
  a <=> b
end
# sorted is now:
# [ "Star 69", "A Star is Born", "The Starr Report",
#   "Starship Troopers", "Star Wars"]

This example is not overly useful, and it could certainly be written more compactly. The point is that any arbitrarily complex set of operations can be performed on two operands to compare them in a specialized way. (Note, however, that we left the original operands untouched by manipulating copies of them.) This general technique can be useful in many situations, for example, sorting on multiple keys or sorting on keys that are computed at runtime.
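As one possible compact form of the title sort, the normalization can be moved into a helper that returns a new string. The name title_key is hypothetical, and the hyphen is placed last in the delete set so that it is treated as a literal character rather than a range.

```ruby
# Hypothetical helper: the normalized form of a title used for comparison.
# sub, delete, and upcase all return new strings, so the title is untouched.
def title_key(title)
  title.sub(/^(a |an |the )/i, "").delete(" .,?!-").upcase
end

titles = ["Starship Troopers", "A Star is Born", "Star Wars",
          "Star 69", "The Starr Report"]
sorted = titles.sort { |x, y| title_key(x) <=> title_key(y) }
p sorted
# ["Star 69", "A Star is Born", "The Starr Report",
#  "Starship Troopers", "Star Wars"]
```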

In more recent versions of Ruby, the Enumerable module has a sort_by method (which of course is mixed into Array). This is important to understand.

The sort_by method employs what Perl people call a Schwartzian transform (after Randal Schwartz). Rather than sort based on the array elements themselves, we apply some kind of function or mapping and sort based on those.

For a contrived example, imagine that we had a list of files and wanted to sort them by size. A straightforward way would be as follows:

files = files.sort {|x,y| File.size(x) <=> File.size(y) }

However, there are two problems here. First, this seems slightly verbose. We should be able to condense it a little.

Second, this results in multiple disk accesses, each of which is a fairly expensive operation (compared to simple in-memory operations). To make it worse, we are doing many of these operations more than once.

Using sort_by addresses both these issues. Here is the "right" way to do it:

files = files.sort_by {|x| File.size(x) }

In the preceding example each key is computed only once and is then stored internally as part of a key/data tuple. For smaller arrays, this may actually decrease efficiency, but it may be worth the more readable code.
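The difference in the number of key computations can be observed directly. The sketch below uses a hypothetical CountingKey class as a stand-in for an expensive lookup such as File.size; the exact count for sort depends on the sorting algorithm, so only the sort_by count is stated.

```ruby
# Hypothetical stand-in for an expensive lookup: it returns a key
# and records how many times it was called.
class CountingKey
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def call(word)
    @calls += 1
    word.length
  end
end

words = %w[pear fig banana kiwi watermelon plum]

with_sort = CountingKey.new
words.sort { |x, y| with_sort.call(x) <=> with_sort.call(y) }

with_sort_by = CountingKey.new
words.sort_by { |w| with_sort_by.call(w) }

puts with_sort.calls      # two calls per comparison, so more than 6
puts with_sort_by.calls   # 6 (one call per element)
```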

There is no sort_by! method in Ruby 1.8 (Array gained one in Ruby 1.9.2). On older versions, you could always write your own.

What about a multikey sort? Imagine that we had an array of objects and needed to sort them based on three of their attributes: name, age, and height. The fact that arrays are comparable means that this technique will work:

list = list.sort_by {|x| [x.name, x.age, x.height] }

Of course, you're not limited to simple array elements like these. Any arbitrary expression could be an array element.
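To make the multikey sort concrete, here is a small sketch. Person is a hypothetical record type invented for the example, and negating the height is one way (valid for numeric keys only) to reverse the order of a single key.

```ruby
# Person is a hypothetical record type used only for this illustration.
Person = Struct.new(:name, :age, :height)

people = [
  Person.new("Bob",   30, 180),
  Person.new("Alice", 25, 165),
  Person.new("Bob",   25, 175),
  Person.new("Alice", 25, 160)
]

# Name ascending, then age ascending, then height descending
# (negating a numeric key reverses its order).
sorted = people.sort_by { |x| [x.name, x.age, -x.height] }
sorted.each { |x| puts "#{x.name} #{x.age} #{x.height}" }
# Alice 25 165
# Alice 25 160
# Bob 25 175
# Bob 30 180
```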

8.1.6. Selecting from an Array by Criteria

Sometimes we want to locate an item or items in an array much as though we were querying a table in a database. There are several ways to do this; the ones we outline here are all mixed in from the Enumerable module.

The detect method will find at most a single element. It takes a block (into which the elements are passed sequentially) and returns the first element for which the block evaluates to a value that tests true.

x = [5, 8, 12, 9, 4, 30]
# Find the first multiple of 6
x.detect {|e| e % 6 == 0 }         # 12
# Find the first multiple of 7
x.detect {|e| e % 7 == 0 }         # nil

Of course, the objects in the array can be of arbitrary complexity, as can the test in the block.

The find method is a synonym for detect; the method find_all is a variant that returns multiple elements as opposed to a single element. Finally, the method select is a synonym for find_all:

# Continuing the above example...
x.find {|e| e % 2 == 0}            # 8
x.find_all {|e| e % 2 == 0}        # [8, 12, 4, 30]
x.select {|e| e % 2 == 0}          # [8, 12, 4, 30]

The grep method invokes the relationship operator (that is, the case equality operator) to match each element against the pattern specified. In its simplest form, it returns an array containing the matched elements. Because the relationship operator (===) is used, the so-called pattern need not be a regular expression. (The name grep, of course, comes from the UNIX world, historically related to the old editor command g/re/p.)

a = %w[January February March April May]
a.grep(/ary/)      # ["January", "February"]
b = [1, 20, 5, 7, 13, 33, 15, 28]
b.grep(12..24)     # [20, 13, 15]

There is a block form that effectively transforms each result before storing it in the array; the resulting array contains the return values of the block rather than the values passed into the block:

# Continuing above example...
# Let's store the string lengths
a.grep(/ary/) {|m| m.length}     # [7, 8]
# Let's square each value
b.grep(12..24) {|n| n*n}         # [400, 169, 225]
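Because grep relies on ===, a class can serve as the pattern as well, since Class#=== tests whether an object is an instance of that class. The array below is made up for illustration.

```ruby
mixed = [1, "two", :three, 4.0, 5, "six", nil]

p(mixed.grep(Integer))                    # [1, 5]
p(mixed.grep(Numeric))                    # [1, 4.0, 5]
p(mixed.grep(String) { |s| s.upcase })    # ["TWO", "SIX"]
```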

The reject method is complementary to select. It excludes each element for which the block evaluates to true. The in-place mutator reject! is also defined:

c = [5, 8, 12, 9, 4, 30]
d = c.reject {|e| e % 2 == 0}    # [5, 9]
c.reject! {|e| e % 3 == 0}
# c is now [5, 8, 4]

The min and max methods may be used to find the minimum and maximum values in an array. There are two forms of each of these; the first form uses the "default" comparison, whatever that may be in the current situation (as defined by the <=> method). The second form uses a block to do a customized comparison.

a = %w[Elrond Galadriel Aragorn Saruman Legolas]
b = a.min                                 # "Aragorn"
c = a.max                                 # "Saruman"
d = a.min {|x,y| x.reverse <=> y.reverse} # "Elrond"
e = a.max {|x,y| x.reverse <=> y.reverse} # "Legolas"

Suppose we wanted to find the index of the minimum or maximum element (assuming it is unique). We could use the index method for tasks such as this:

# Continuing above example...
i = a.index a.min     # 2
j = a.index a.max     # 3

This same technique can be used in other similar situations. However, if the element is not unique, the first one in the array will naturally be the one found.
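The approach above scans the array twice (once for min, once for index). On Ruby versions where each_with_index returns an enumerator when called without a block, a single pass is possible. This is a sketch; min_with_index is a hypothetical helper name.

```ruby
# Hypothetical helper: returns [element, index] for the minimum element,
# or nil for an empty array.
def min_with_index(array)
  array.each_with_index.min_by { |item, _index| item }
end

a = %w[Elrond Galadriel Aragorn Saruman Legolas]
name, index = min_with_index(a)
puts "#{name} at #{index}"   # Aragorn at 2
```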

8.1.7. Using Specialized Indexing Functions

The internals of a language handle the mapping of array indices to array elements through an indexing function. Because the methods that access array elements can be overridden, we can in effect index an array in any way we want.

For example, in the following code we implement an array that is one-based rather than zero-based:

class Array2 < Array
  def [](index)
    if index>0
      super(index-1)
    else
      raise IndexError
    end
  end
  def []=(index,obj)
    if index>0
      super(index-1,obj)
    else
      raise IndexError
    end
  end
end
x = Array2.new
x[1]=5
x[2]=3
x[0]=1  # Error
x[-1]=1 # Error

Note that the negative indexing (from the end of an array) is disallowed here. And be aware that if this were a real-life solution, there would be other changes to make, such as the slice method and others. But this gives the general idea.

A similar approach can be used to implement multidimensional arrays (as we will see in section 8.1.11 "Using Multidimensional Arrays").

It is also possible to implement something like a triangular matrix as shown here. This is like a special case of a two-dimensional array in which element x,y is always the same as element y,x (so that only one need be stored). This is sometimes useful, for example, in storing an undirected graph (as we will see toward the end of this chapter).

class TriMatrix
  def initialize
    @store = []
  end
  def [](x,y)
    if x >= y      # row must not be less than column
      index = (x*x+x)/2 + y
      @store[index]
    else
      raise IndexError
    end
  end
  def []=(x,y,v)
    if x >= y      # row must not be less than column
      index = (x*x+x)/2 + y
      @store[index] = v
    else
      raise IndexError
    end
  end
end
t = TriMatrix.new
t[3,2] = 1
puts t[3,2]  # 1
puts t[2,3]  # IndexError

In the preceding example we chose to implement the matrix so that the row number must be greater than or equal to the column number; we also could have coded it so that the same pair of indices simply mapped to the same element. These design decisions will depend on your use of the matrix.
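The alternative design, in which either ordering of the indices reaches the same element, might look like the following. SymMatrix and index_for are hypothetical names; the storage formula is the same one used in TriMatrix.

```ruby
# SymMatrix is a hypothetical variant of TriMatrix in which [x,y] and [y,x]
# refer to the same stored element.
class SymMatrix
  def initialize
    @store = []
  end

  def [](x, y)
    @store[index_for(x, y)]
  end

  def []=(x, y, v)
    @store[index_for(x, y)] = v
  end

  private

  # Map either ordering of the pair onto the lower triangle.
  def index_for(x, y)
    raise IndexError if x < 0 || y < 0
    x, y = y, x if y > x
    (x * x + x) / 2 + y
  end
end

s = SymMatrix.new
s[3, 2] = 1
puts s[2, 3]   # 1
s[4, 4] = 7    # an element on the diagonal
puts s[4, 4]   # 7
```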

It would have been possible to inherit from Array, but we thought this solution was easier to understand. The indexing formula is a little complex, but 10 minutes with pencil and paper should convince anyone it is correct. Enhancements probably could be made to this class to make it truly useful, but we will leave that to you.

Also, it is possible to implement a triangular matrix as an array containing arrays that increase in size as the row number gets higher. This is similar to what we do in section 8.1.11 "Using Multidimensional Arrays." The only tricky part would be to make sure that a row does not accidentally grow past its proper size.

8.1.8. Implementing a Sparse Matrix

Sometimes we need an array that has very few elements defined; the rest of its elements can be undefined (or more often zero). This so-called sparse matrix has historically been a waster of memory that led people to seek indirect ways of implementing it.

Of course, in most cases, a Ruby array will suffice because modern architectures typically have large amounts of memory. An unassigned element will have the value nil, which takes only a few bytes to store.

But on the other hand, assigning an array element beyond the previous bounds of the array also creates all the nil elements in between. For example, if elements 0 through 9 are defined, and we suddenly assign to element 1000, we have in effect caused elements 10 through 999 to spring into being as nil values. If this is unacceptable, you might consider another alternative.

The alternative we have to suggest, however, does not involve arrays at all. If we really need a sparse matrix, a hash might be the best solution. See section 8.2.14 "Using a Hash As a Sparse Matrix".
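As a rough preview of the hash-based idea (not necessarily the version presented in the section referenced above), a hash keyed by [row, column] pairs with a default value stores only the cells that were actually assigned. SparseGrid is a hypothetical name.

```ruby
# SparseGrid is a hypothetical name; unset cells read as the default value
# and occupy no storage.
class SparseGrid
  def initialize(default = 0)
    @cells = Hash.new(default)
  end

  def [](row, col)
    @cells[[row, col]]
  end

  def []=(row, col, value)
    @cells[[row, col]] = value
  end

  # Number of cells actually stored.
  def stored
    @cells.size
  end
end

grid = SparseGrid.new
grid[0, 0] = 1
grid[1000, 2000] = 5
puts grid[1000, 2000]   # 5
puts grid[500, 500]     # 0 (never assigned)
puts grid.stored        # 2
```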

8.1.9. Using Arrays As Mathematical Sets

Most languages do not directly implement sets (Pascal being one exception). But Ruby arrays have some features that make them usable as sets. We'll present these here and add a few of our own.

Recent versions of Ruby include a Set class in the standard library. If your needs are more than incidental, consider using Set objects rather than actual arrays. These are covered in Chapter 9, "More Advanced Data Structures."

An array isn't a perfect fit for representing a set because an array can have duplicate entries. If you specifically want to treat the array as a set, you can remove these (using uniq or uniq!).

The two most basic operations performed on sets are union and intersection. These are accomplished by the | ("or") and & ("and") operators, respectively. In accordance with the idea that a set does not contain duplicates, any duplicates will be removed. (This may be contrary to your expectations if you are used to array union and intersection operations in some other language.)

a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]
c = a | b            # [1, 2, 3, 4, 5, 6, 7]
d = a & b            # [3, 4, 5]
# Duplicates are removed...
e = [1, 2, 2, 3, 4]
f = [2, 2, 3, 4, 5]
g = e & f            # [2, 3, 4]

The concatenation operator + can be used for set union, but it does not remove duplicates.

The - method is a "set difference" operator that produces a set with all the members of the first set except the ones appearing in the second set. (See section 8.1.12, "Finding Elements in One Array But Not Another.")

a = [1, 2, 3, 4, 5]
b = [4, 5, 6, 7]
c = a - b            # [1, 2, 3]
# Note that the extra items 6 and 7 are irrelevant.

To "accumulate" sets it is possible to use the |= operator; as expected, a |= b simply means a = a | b. Likewise &= can progressively "narrow down" the elements of a set.
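A short sketch of accumulating with |= and narrowing with &= follows. The helper names union_all and common_to_all are hypothetical, and common_to_all assumes it is given at least one array.

```ruby
# Hypothetical helper: the union of every array in batches.
def union_all(batches)
  result = []
  batches.each { |batch| result |= batch }
  result
end

# Hypothetical helper: the elements present in every array in batches.
# Assumes batches is not empty.
def common_to_all(batches)
  result = batches.first.dup
  batches.each { |batch| result &= batch }
  result
end

batches = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
p union_all(batches)       # [1, 2, 3, 4, 5]
p common_to_all(batches)   # [3]
```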

There is no exclusive-or defined for arrays, but we can make our own easily. In set terms, this corresponds to elements that are in the union of two sets but not in the intersection.

class Array
  def ^(other)
    (self | other) - (self & other)
  end
end
x = [1, 2, 3, 4, 5]
y = [3, 4, 5, 6, 7]
z = x ^ y            # [1, 2, 6, 7]

To check for the presence of an element in a set, we can use the method include? or member? (essentially an alias mixed in from Enumerable):

x = [1, 2, 3]
if x.include? 2
  puts "yes"     # Prints "yes"
else
  puts "no"
end

Of course, this is a little backward from what we are used to in mathematics, where the operator resembling a Greek epsilon denotes set membership. It is backward in the sense that the set is on the left rather than on the right; we are not asking "Is this element in this set?" but rather "Does this set contain this element?"

Many people will not be bothered by this at all. But if you are used to Pascal or Python (or you have ingrained mathematical inclinations), you may want a different way. We present an option in the following code:

class Object
  def in(other)
    other.include? self
  end
end
x = [1, 2, 3]
if 2.in x
  puts "yes"     # Prints "yes"
else
  puts "no"
end

I personally have made a Ruby Change Request (RCR 241) proposing an in operator for Ruby. This would be similar to the operator in Pascal or Python or even SQL.

The idea has its advantages (and in is already a reserved word), but it has been received with mixed popularity. It may or may not ever be part of Ruby.

Now let's look at subsets and supersets. How do we tell whether a set is a subset or a superset of another? There are no built-in methods, but we can do it this way:

class Array
  def subset?(other)
    self.each  do |x|
      if !(other.include? x)
        return false
      end
    end
    true
  end
  def superset?(other)
    other.subset?(self)
  end
end
a = [1, 2, 3, 4]
b = [2, 3]
c = [2, 3, 4, 5]
flag1 = c.subset? a     # false
flag2 = b.subset? a     # true
flag3 = c.superset? b   # true

Note that we've chosen the "natural" ordering; that is, x.subset? y means "Is x a subset of y?" rather than vice versa.
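The same test can also be phrased as a set difference without reopening Array, or delegated to the Set class in the standard library. In this sketch, subset_of? is a hypothetical helper name.

```ruby
require 'set'

# Hypothetical helper that avoids reopening Array: small is a subset of big
# when removing big's elements from small leaves nothing.
def subset_of?(small, big)
  (small - big).empty?
end

a = [1, 2, 3, 4]
b = [2, 3]
c = [2, 3, 4, 5]

p subset_of?(c, a)                     # false
p subset_of?(b, a)                     # true
p Set.new(b).subset?(Set.new(a))       # true
p Set.new(c).superset?(Set.new(b))     # true
```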

To detect the null set (or empty set), we simply detect the empty array. The empty? method does this.

The concept of set negation (or complement) depends on the concept of a universal set. Because in practical terms this varies from one application or situation to another, the best way is the simplest: Define the universe; then do a set difference.

universe = [1, 2, 3, 4, 5, 6]
a = [2, 3]
b = universe - a   # complement of a = [1, 4, 5, 6]

Of course, if you really felt the need, you could define a unary operator (such as - or ~) to do this.

You can iterate through a set just by iterating through the array. The only difference is that the elements will come out in order, which you may not want. To iterate randomly, see section 8.1.18, "Iterating Over an Array".

Finally, we may sometimes want to compute the powerset of a set. This is simply the set of all possible subsets (including the null set and the original set itself). Those familiar with discrete math or especially combinatorics will see that there must be 2ⁿ of these subsets. We can generate the powerset as follows:

class Array
  def powerset
    num = 2**size
    ps = Array.new(num, [])
    self.each_index do |i|
      a = 2**i
      b = 2**(i+1) - 1
      j = 0
      while j < num-1
        for j in j+a..j+b
          ps[j] += [self[i]]
        end
        j += 1
      end
    end
    ps
  end
end
x = [1, 2, 3]
y = x.powerset
# y is now:
#   [[], [1], [2], [1,2], [3], [1,3], [2,3], [1,2,3]]
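An alternative formulation, offered here only as a sketch, treats each integer from 0 to 2ⁿ - 1 as a bit mask in which bit i decides whether element i belongs to the subset. The name powerset_of is hypothetical; the subsets come out in the same order as above.

```ruby
# Hypothetical standalone helper; bit i of mask decides whether
# element i of set belongs to the subset.
def powerset_of(set)
  (0...2**set.size).map do |mask|
    set.each_index.select { |i| mask[i] == 1 }.map { |i| set[i] }
  end
end

p powerset_of([1, 2, 3])
# [[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]
```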

8.1.10. Randomizing an Array

Sometimes we want to scramble an array into a random order. The first example that might come to mind is a card game, but there are other circumstances such as presenting a list of questions to a user in a random order.

To accomplish this task, we can use the rand method in the Kernel module. The following is one way to do this:

class Array
  def randomize
    self.sort_by { rand }    # sort by a key which is a
  end                        #  random number
  def randomize!
    self.replace(self.randomize)
  end
end
x = [1, 2, 3, 4, 5]
y = x.randomize      # [3, 2, 4, 1, 5]
x.randomize!         # x is now [3, 5, 4, 1, 2]

Because of the nature of the sorting, there is probably a slight bias introduced here. In most cases it won't matter.
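Where an unbiased shuffle matters, a Fisher-Yates shuffle is a common choice, and Ruby 1.8.7 and later also provide Array#shuffle and shuffle! directly. The sketch below is illustrative: fisher_yates is a hypothetical helper, and it assumes the Random class (Ruby 1.9 and later) as the source of random numbers.

```ruby
# Hypothetical helper implementing a Fisher-Yates shuffle on a copy.
# rng is any object that responds to rand(n), such as a Random instance.
def fisher_yates(array, rng = Random.new)
  result = array.dup
  (result.size - 1).downto(1) do |i|
    j = rng.rand(i + 1)                       # 0 <= j <= i
    result[i], result[j] = result[j], result[i]
  end
  result
end

deck = (1..10).to_a
p fisher_yates(deck)     # some permutation of 1..10
p deck.shuffle           # the built-in equivalent

# The same seed reproduces the same order:
p(fisher_yates(deck, Random.new(42)) == fisher_yates(deck, Random.new(42)))   # true
```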

If we wanted simply to pick an array element at random (without disallowing duplicates), we could do that as follows:

class Array
  def pick_random
    self[rand(self.length)]
  end
end

Finally, we should remember that any time we are using rand, we can generate a predictable sequence (for example, in unit testing) simply by seeding with a known seed using srand (see section 5.28 "Generating Random Numbers").

8.1.11. Using Multidimensional Arrays

If you want to use multidimensional arrays for numerical purposes, there is an excellent library in the Ruby Application Archive called NArray (by Masahiro Tanaka). If you want to use matrices, there is also the matrix.rb standard library as mentioned in section 5.10, "Matrix Manipulation".

The following example presents a way of handling multidimensional arrays by overloading the [] and []= methods to map elements onto a nested array. The class Array3 presented here handles three-dimensional arrays in a rudimentary fashion, but it is far from complete:

class Array3
  def initialize
     @store = [[[]]]
  end
  def [](a,b,c)
    if @store[a]==nil ||
       @store[a][b]==nil ||
       @store[a][b][c]==nil
      return nil
    else
      return @store[a][b][c]
    end
  end
  def []=(a,b,c,x)
    @store[a] = [[]] if @store[a]==nil
    @store[a][b] = [] if @store[a][b]==nil
    @store[a][b][c] = x
  end
end
x = Array3.new
x[0,0,0] = 5
x[0,0,1] = 6
x[1,2,3] = 99
puts x[1,2,3]

Note that all we really gain here is the convenience of a "comma" notation [x,y,z] instead of the more C-like [x][y][z]. If the C-style notation is acceptable to you, you can just use nested arrays in Ruby. Another minor benefit is the prevention of the situation in which nil is the receiver for the bracket method.
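For comparison, the nested-array approach with C-style indexing can be sketched as follows. make_grid3 is a hypothetical helper; it uses the block form of Array.new so that every row and plane is a distinct array.

```ruby
# Hypothetical helper: an x-by-y-by-z nested array whose cells start as fill.
# The blocks make every row and plane a distinct array.
def make_grid3(x, y, z, fill = nil)
  Array.new(x) { Array.new(y) { Array.new(z, fill) } }
end

grid = make_grid3(2, 3, 4, 0)
grid[1][2][3] = 99
puts grid[1][2][3]   # 99
puts grid[0][2][3]   # 0 (other planes are unaffected)

# grid[5] is nil, so grid[5][0] would raise NoMethodError; this is the
# situation that Array3 avoids by checking for nil first.
```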

8.1.12. Finding Elements in One Array But Not Another

This is simpler in Ruby than in many languages. It is a simple "set difference" problem:

text = %w[the magic words are squeamish ossifrage]
dictionary = %w[an are magic the them these words]
# Find potential misspellings
unknown = text - dictionary   # ["squeamish", "ossifrage"]

8.1.13. Transforming or Mapping Arrays

The collect method (part of Enumerable) is a useful tool that proves to be a time and labor saver in many circumstances. If you are a Smalltalk programmer, this may be more intuitive than if you come from a C background.

This method simply operates on each element of an array in some arbitrary way to produce a new array. In other words, it "maps" an array onto another array (hence the synonym map).

x = %w[alpha bravo charlie delta echo foxtrot]
# Get the initial letters
a = x.collect {|w| w[0..0]}        # %w[a b c d e f]
# Get the string lengths
b = x.collect {|w| w.length}       # [5, 5, 7, 5, 4, 7]
# map is just an alias
c = x.map {|w| w.length}           # [5, 5, 7, 5, 4, 7]

The in-place variant collect! (or map!) is also defined.

x.collect! {|w| w.upcase}
# x is now %w[ALPHA BRAVO CHARLIE DELTA ECHO FOXTROT]
x.map! {|w| w.reverse}
# x is now %w[AHPLA OVARB EILRAHC ATLED OHCE TORTXOF]

8.1.14. Removing nil Values from an Array

The compact method (or its in-place version compact!) removes nil values from an array, leaving the rest untouched:

a = [1, 2, nil, 3, nil, 4, 5]
b = a.compact     # [1, 2, 3, 4, 5]
a.compact!        # a is now [1, 2, 3, 4, 5]

8.1.15. Removing Specific Array Elements

It is easy to delete elements from a Ruby array, and there are many ways to do it. If you want to delete one specific element by index, delete_at is a good way:

a = [10, 12, 14, 16, 18]
a.delete_at(3)              # Returns 16
# a is now [10, 12, 14, 18]
a.delete_at(9)              # Returns nil (out of range)

If you want to delete all instances of a certain piece of data, delete will do the job. It returns the value of the deleted item, or nil if the item was not found:

b = %w(spam spam bacon spam eggs ham spam)
b.delete("spam")            # Returns "spam"
# b is now ["bacon", "eggs", "ham"]
b.delete("caviar")          # Returns nil

The delete method also accepts a block. This may be a little counterintuitive; all that happens is that the block is evaluated (potentially performing a wide range of operations) if the object is not found, and the value of the block is returned.

c = ["alpha", "beta", "gamma", "delta"]
c.delete("delta") { "Nonexistent" }
# Returns "delta" (block is never evaluated)
c.delete("omega") { "Nonexistent" }
# Returns "Nonexistent"

The delete_if method passes every element into the supplied block and deletes the elements for which the block evaluates to true. It behaves similarly to reject!, except that the latter can return nil when the array remains unchanged.

email = ["job offers", "greetings", "spam", "news items"]
# Delete four-letter words
email.delete_if {|x| x.length==4 }
# email is now ["job offers", "greetings", "news items"]
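The difference in return values between delete_if and reject! shows up only when nothing is removed. The arrays below are made up for illustration.

```ruby
nums = [1, 3, 5]

# No even numbers to remove:
p(nums.dup.reject!   { |x| x.even? })   # nil
p(nums.dup.delete_if { |x| x.even? })   # [1, 3, 5]

# Something is removed, so both return the modified array:
p([1, 2, 3, 4].reject!   { |x| x.even? })   # [1, 3]
p([1, 2, 3, 4].delete_if { |x| x.even? })   # [1, 3]
```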

The slice! method accesses the same elements as slice but deletes them from the array as it returns their values:

x = [0, 2, 4, 6, 8, 10, 12, 14, 16]
a = x.slice!(2)                          # 4
# x is now [0, 2, 6, 8, 10, 12, 14, 16]
b = x.slice!(2,3)                        # [6, 8, 10]
# x is now [0, 2, 12, 14, 16]
c = x.slice!(2..3)                       # [12, 14]
# x is now [0, 2, 16]

The shift and pop methods can be used for deleting array elements. (For more about their intended uses, see section 9.2, "Working with Stacks and Queues".)

x = [1, 2, 3, 4, 5]
x.pop                   # Delete the last element
# x is now [1, 2, 3, 4]
x.shift                 # Delete the first element
# x is now [2, 3, 4]

The reject method takes a block and produces a new array without the elements for which the block returns true:

arr = [1,2,3,4,5,6,7,8]
odd = arr.reject {|x| x % 2 == 0 }     # [1,3,5,7]

Finally, the clear method deletes all the elements in an array. It resembles assigning an empty array to the variable, but it empties the existing object in place (so every other reference to that array sees the change) and is marginally more efficient:

x = [1, 2, 3]
x.clear
# x is now []
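One practical difference between clear and assigning an empty array shows up when a second variable refers to the same array. The following sketch, with illustrative variable names, contrasts the two:

```ruby
# clear empties the existing object, so every reference sees the change.
x = [1, 2, 3]
y = x          # y refers to the same array object as x
x.clear
p y            # []

# Assignment merely points the variable at a new, empty array.
x2 = [1, 2, 3]
y2 = x2
x2 = []        # x2 now refers to a different object
p y2           # [1, 2, 3]
```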

8.1.16. Concatenating and Appending onto Arrays

Frequently we want to append an element or another array onto an array. You can do this in many ways with a Ruby array.

The "append" operator << appends an object onto an array; the return value is the array itself, so that these operations can be "chained."

x = [1, 5, 9]
x << 13        # x is now [1, 5, 9, 13]
x << 17 << 21  # x is now [1, 5, 9, 13, 17, 21]

Similar to the append operator are the unshift and push methods, which add to the beginning and end of an array, respectively. See section 8.1.17 "Using an Array As a Stack or Queue" later in this chapter.

Arrays may be concatenated with the concat method or by using the + and += operators:

x = [1,2]
y = [3,4]
z = [5,6]
b = y + z         # [3,4,5,6]
b += x            # [3,4,5,6,1,2]
z.concat y        # z is now [5,6,3,4]

Bear in mind that += always creates a new object. Also bear in mind that << appends a new array element (which may itself be an array).

a = [1,2]
b = [3,4]
a += b           # [1,2,3,4]

a = [1,2]
b = [3,4]
a <<  b          # [1,2,[3,4]]

a = [1,2]
b = [3,4]
a = a.concat(b)  # [1,2,3,4]
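The point that += creates a new object, whereas concat modifies the receiver, can be checked with equal?, which tests object identity. This is a small sketch with illustrative variable names:

```ruby
# += builds a new array and rebinds the variable.
a = [1, 2]
original = a
a += [3, 4]
p a.equal?(original)   # false
p original             # [1, 2]

# concat changes the existing array in place.
b = [1, 2]
same = b
b.concat([3, 4])
p b.equal?(same)       # true
p same                 # [1, 2, 3, 4]
```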

8.1.17. Using an Array As a Stack or Queue

The basic stack operations are push and pop, which add and remove items at the end of an array. For a queue, items are typically added at the end with push and removed from the beginning with shift; the unshift method adds an element to the beginning. The append operator << can also be used to add an item to the end of an array (basically a synonym for push).

Don't get confused. The shift and unshift methods work on the beginning of an array; the push, pop, and << methods work on the end.

For a better discussion of this topic, see section 9.2, "Working with Stacks and Queues".
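As a brief sketch of these operations (the variable names are illustrative), a stack adds and removes at the same end, while a queue adds at one end and removes from the other:

```ruby
# Stack: last in, first out. Both operations work on the end.
stack = []
stack.push(1)
stack.push(2)
stack << 3             # same effect as push
top = stack.pop        # 3
# stack is now [1, 2]

# Queue: first in, first out. Add at the end, remove from the beginning.
queue = []
queue.push("a")
queue.push("b")
queue << "c"
first = queue.shift    # "a"
# queue is now ["b", "c"]

# unshift puts an item back at the beginning.
queue.unshift("z")
# queue is now ["z", "b", "c"]
```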

8.1.18. Iterating Over an Array

The Array class has the standard iterator each as is to be expected. However, it also has other useful iterators.

The reverse_each method iterates in reverse order. It is equivalent to using reverse and then each, but it is faster.

words = %w(Son I am able she said)
str = ""
words.reverse_each { |w| str += "#{w} "}
# str is now "said she able am I Son "

If we only want to iterate over the indices, we can use each_index. Saying x.each_index is equivalent to saying (0..(x.size-1)).each (that is, iterating over the range of indices).
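A minimal sketch of that equivalence, collecting the indices both ways (the variable names are illustrative):

```ruby
x = ["alpha", "beta", "gamma"]

via_each_index = []
x.each_index { |i| via_each_index << i }

via_range = []
(0..(x.size - 1)).each { |i| via_range << i }

p via_each_index   # [0, 1, 2]
p via_range        # [0, 1, 2]
```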

The iterator each_with_index (mixed in from Enumerable) passes both the element and the index into the block.

x = ["alpha", "beta", "gamma"]
x.each_with_index do |elem,i|   # elem avoids reusing the name x
  puts "Element #{i} is #{elem}"
end
# Produces three lines of output

Suppose that we wanted to iterate over an array in random order. We show here the iterator random_each (which simply invokes the randomize method from section 8.1.10, "Randomizing an Array").

class Array
  # Assumes we have defined randomize
  def random_each
    temp = self.randomize
    temp.each {|x| yield x}
  end
end

dwarves = %w(Sleepy Dopey Happy Sneezy Grumpy Bashful Doc)
list = ""
dwarves.random_each {|x| list += "#{x} "}
# list is now:
# "Bashful Dopey Sleepy Happy Grumpy Doc Sneezy "
# (Your mileage may vary.)
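For readers without the randomize method at hand, a similar iterator can be sketched with the built-in Array#shuffle (available in Ruby 1.8.7 and later). The method name random_each_sketch is hypothetical, and it is written as a standalone method, not as an addition to the Array class.

```ruby
# Hypothetical helper: yields each element of a shuffled copy.
# The original array is left untouched because shuffle returns a new array.
def random_each_sketch(array)
  array.shuffle.each { |x| yield x }
end

dwarves = %w(Sleepy Dopey Happy Sneezy Grumpy Bashful Doc)
seen = []
random_each_sketch(dwarves) { |d| seen << d }
# seen now holds all seven names in some random order
```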

8.1.19. Interposing Delimiters to Form a String

Frequently we will want to insert delimiters in between array elements in a "fencepost" fashion; that is, we want to put delimiters between the elements but not before the first one or after the last one. The method join will do this, as will the * operator.

been_there = ["Veni", "vidi", "vici."]
journal = been_there.join(", ")        # "Veni, vidi, vici."

letters = ["Phi","Mu","Alpha"]
musicians = letters.join(" ")          # "Phi Mu Alpha"

people = ["Bob","Carol","Ted","Alice"]
movie = people * " and "
# movie is now "Bob and Carol and Ted and Alice"

Note that if we really need to treat the last element differently, perhaps by inserting the word and, we can do it manually:

list = %w[A B C D E F]
with_commas = list[0..-2]*", " + ", and " + list[-1]
# with_commas is now "A, B, C, D, E, and F"
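The one-line expression above assumes a list of several elements; a one-element list would yield ", and A". A slightly more defensive version might look like the following sketch, in which the helper name join_with_and is hypothetical:

```ruby
# Hypothetical helper: joins a list with commas and a final "and",
# treating very short lists as special cases.
def join_with_and(list)
  case list.length
  when 0 then ""
  when 1 then list[0].to_s
  when 2 then list.join(" and ")
  else list[0..-2].join(", ") + ", and " + list[-1].to_s
  end
end

p join_with_and(%w[A B C D E F])   # "A, B, C, D, E, and F"
p join_with_and(%w[A B])           # "A and B"
p join_with_and(%w[A])             # "A"
```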

8.1.20. Reversing an Array

To reverse the order of an array, use the reverse or reverse! methods:

inputs = ["red", "green", "blue"]
outputs = inputs.reverse          # ["blue","green","red"]

priorities = %w(eat sleep code)
priorities.reverse!               # ["code","sleep","eat"]

8.1.21. Removing Duplicate Elements from an Array

If you want to remove duplicate elements from an array, the uniq method (or its in-place mutator uniq!) will do the job:

breakfast = %w[spam spam eggs ham eggs spam]
lunch = breakfast.uniq   # ["spam","eggs","ham"]
breakfast.uniq!          # breakfast has changed now

8.1.22. Interleaving Arrays

Suppose that we wanted to take two arrays and "interleave" them so that the new array contains smaller arrays of paired elements from each of the two original ones. Recent versions of Ruby have the zip method in Enumerable.

a = [1, 2, 3, 4]
b = ["a", "b", "c", "d"]
c = a.zip(b)
# c is now [[1,"a"], [2,"b"], [3,"c"], [4,"d"]]

# Use flatten if you want to eliminate the nesting
d = c.flatten
# d is now [1, "a", 2, "b", 3, "c", 4, "d"]
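It may help to see what zip does when the arrays differ in length or when more than one argument is given. The result always has as many entries as the receiver. This sketch uses illustrative variable names:

```ruby
long  = [1, 2, 3]
short = ["a", "b"]

# Missing elements in the argument are filled with nil.
padded = long.zip(short)
# padded is [[1, "a"], [2, "b"], [3, nil]]

# Extra elements in the argument are ignored.
truncated = short.zip(long)
# truncated is [["a", 1], ["b", 2]]

# zip accepts several arrays at once.
triples = long.zip(short, [:x, :y, :z])
# triples is [[1, "a", :x], [2, "b", :y], [3, nil, :z]]
```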

8.1.23. Counting Frequency of Values in an Array

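Older versions of Ruby (before 1.8.7) have no count method for arrays as there is for strings (to count the occurrences of each data item). So we create one here; note that on newer versions this definition replaces the built-in Array#count: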

class Array
  def count
    k=Hash.new(0)
    self.each{|x| k[x]+=1 }
    k
  end
end

meal = %w[spam spam eggs ham eggs spam]
items = meal.count
# items is {"ham" => 1, "spam" => 3, "eggs" => 2}
spams = items["spam"]   # 3

Note that a hash is returned. No pun intended.
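For readers who would rather not reopen the Array class, the same tally can be written as a standalone method. This is only a sketch, and the name frequencies is hypothetical. Ruby 1.8.7 and later also define a built-in Array#count, which counts the matches for a single item, and Ruby 2.7 and later provide tally, which returns a hash of counts.

```ruby
# Hypothetical helper: returns a hash mapping each element to its count.
def frequencies(list)
  counts = Hash.new(0)             # missing keys default to 0
  list.each { |x| counts[x] += 1 }
  counts
end

meal = %w[spam spam eggs ham eggs spam]
items = frequencies(meal)
spams = items["spam"]              # 3

# The built-in count answers the single-item question directly.
eggs = meal.count("eggs")          # 2
```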

8.1.24. Inverting an Array to Form a Hash

An array is used to associate an integer index with a piece of data. But what if you want to invert that association, that is, associate the data with the index, producing a hash? The following method will do just that:

class Array
  def invert
    h={}
    self.each_with_index{|x,i| h[x]=i}
    h
  end
end

a = ["red","yellow","orange"]
h = a.invert     # {"orange"=>2, "yellow"=>1, "red"=>0}
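One consequence of this approach is worth noting: if the array contains duplicates, a later index overwrites an earlier one. The sketch below shows this behavior without modifying Array, along with a variation that keeps the first index. The variable names are illustrative.

```ruby
colors = ["red", "yellow", "red"]

# Same logic as invert: the last occurrence wins.
last_index = {}
colors.each_with_index { |x, i| last_index[x] = i }
# last_index is {"red"=>2, "yellow"=>1}

# Variation: keep the first occurrence by skipping keys already seen.
first_index = {}
colors.each_with_index { |x, i| first_index[x] = i unless first_index.key?(x) }
# first_index is {"red"=>0, "yellow"=>1}
```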

8.1.25. Synchronized Sorting of Multiple Arrays

Suppose we wanted to sort an array, but we had other arrays that corresponded with this one on an element-for-element basis. In other words, we don't want to get them out of sync.

The solution presented here sorts an array and gathers the resulting set of indices. The list of indices (itself an array) can be applied to any other array to put its elements in the same order.

class Array
  def sort_index
    d=[]
    self.each_with_index{|x,i| d[i]=[x,i]}
    if block_given?
      d.sort {|x,y| yield x[0],y[0]}.collect{|x| x[1]}
    else
      d.sort.collect{|x| x[1]}
    end
  end
  def sort_with(ord=[])
    return nil if self.length!=ord.length
    self.values_at(*ord)
  end
end

a = [21, 33, 11, 34, 36, 24, 14]
b = a.sort_index
a2 = a.sort_with(b)
c = a.sort_index {|x,y| x%2 <=> y%2 }
a3 = a.sort_with(c)
p a         # [21, 33, 11, 34, 36, 24, 14]
p b         # [2, 6, 0, 5, 1, 3, 4]
p a2        # [11, 14, 21, 24, 33, 34, 36]
# Ruby's sort is not stable, so the order among elements
# that compare as equal (as in c and a3) may vary.
p c         # [6, 5, 4, 3, 2, 1, 0]
p a3        # [14, 24, 36, 34, 11, 33, 21]
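When only two or three arrays are involved, a different technique avoids modifying Array: zip the arrays together, sort the combined rows, and split them apart again with transpose. This is a sketch of that alternative, not the method presented above, and the sample data is invented for illustration.

```ruby
names = ["carol", "alice", "bob"]
ages  = [35, 28, 41]

# Pair up corresponding elements, then sort the pairs by name.
sorted_pairs = names.zip(ages).sort_by { |name, _age| name }

# transpose turns the list of pairs back into two parallel arrays.
sorted_names, sorted_ages = sorted_pairs.transpose
# sorted_names is ["alice", "bob", "carol"]
# sorted_ages  is [28, 41, 35]
```

The index-based approach is more flexible when many arrays must follow the same ordering, since the list of indices is computed once and reused.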

8.1.26. Establishing a Default Value for New Array Elements

When an array grows and new (unassigned) elements are created, these default to nil values:

a = Array.new
a[0]="x"
a[3]="y"
# a is now ["x", nil, nil, "y"]

What if we want to set those new elements to some other value? As a specific instance of a general principle, we offer the ZArray class, which defaults new unassigned elements to 0:

class ZArray < Array
  def [](x)
    # Reading at or past the end creates the missing elements as 0
    if x >= size
      for i in size..x
        self[i]=0
      end
    end
    super(x)
  end
  def []=(x,v)
    max = size
    super(x,v)
    # Fill any gap between the old end and the new element with 0
    if size - max > 1
      (max..size-2).each do |i|
        self[i] = 0
      end
    end
  end
end

num = ZArray.new
num[1] = 1
num[2] = 4
num[5] = 25
# num is now [0, 1, 4, 0, 0, 25]