6.2. Ranges

Ranges are fairly intuitive, but they do have a few confusing uses and qualities. A numeric range is one of the simplest:

digits = 0..9
scale1 = 0..10
scale2 = 0...10

The .. operator is inclusive of its endpoint, and the ... is exclusive of its endpoint. (This may seem unintuitive to you; if so, just memorize this fact.) So digits and scale2 shown in the preceding example are effectively the same.

But ranges are not limited to integers or numbers. The beginning and end of a range may be any Ruby object. However, not all ranges are meaningful or useful, as we shall see.

The primary operations you might want to do on a range are to iterate over it, convert it to an array, or determine whether it includes a given object. Let's look at all the ramifications of these and other operations.

6.2.1. Open and Closed Ranges

We call a range closed if it includes its end, and open if it does not:

r1 = 3..6     # closed
r2 = 3...6    # open
a1 = r1.to_a  # [3,4,5,6]
a2 = r2.to_a  # [3,4,5]

There is no way to construct a range that excludes its beginning point. This is arguably a limitation of the language.

6.2.2. Finding Endpoints

The first and last methods return the left and right endpoints of a range. Synonyms are begin and end (which are normally keywords but may be called as methods when there is an explicit receiver).

r1 = 3..6
r2 = 3...6
r1a, r1b = r1.first, r1.last    # 3, 6
r1c, r1d = r1.begin, r1.end     # 3, 6
r2a, r2b = r2.begin, r2.end     # 3, 6 (the end is reported even though it is excluded)

The exclude_end? method tells us whether the endpoint is excluded:

r1.exclude_end?   # false
r2.exclude_end?   # true

6.2.3. Iterating Over Ranges

Typically it's possible to iterate over a range. For this to work, the class of the endpoints must define a meaningful succ method.

(3..6).each {|x| puts x }  # prints four lines
                           # (parens are necessary)

So far, so good. But be very cautious when dealing with String ranges! That class does define a succ method, but it is of limited usefulness. You should use this kind of feature only in well-defined, isolated circumstances because the succ method for strings is not defined with exceptional rigor. (It is "intuitive" rather than lexicographic, and thus there are strings that have a successor that is surprising or meaningless.)

r1 = "7".."9"
r2 = "7".."10"
r1.each {|x| puts x }   # Prints three lines
r2.each {|x| puts x }   # Prints no output!

The preceding examples look similar but work differently. The reason lies partly in the fact that in the second range, the endpoints are strings of different length. To our eyes, we expect this range to cover the strings "7", "8", "9", and "10", but what really happens?

When we try to iterate over r2, we start with a value of "7" and enter a loop that terminates when the current value is greater than the right-hand endpoint. But because "7" and "10" are strings, not numbers, they are compared as such. In other words, they are compared lexicographically, and we find that the left endpoint is greater than the right endpoint. So we don't loop at all.

What about floating point ranges? We can construct them, and we can certainly test membership in them, which makes them useful. But we can't iterate over them because there is no succ method.

fr = 2.0..2.2
fr.each {|x| puts x }   # error!

Why isn't there a floating point succ method? It would be theoretically possible to increment the floating point number by epsilon each time. But this would be highly architecture-dependent, it would result in a frighteningly high number of iterations for even "small" ranges, and it would be of limited usefulness.
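When evenly spaced floating point values are what we really want, one common workaround is to iterate over an integer range and scale each value. The following is a minimal sketch; the helper name float_steps and its parameters are assumptions made for illustration, not part of Ruby.

```ruby
# Hypothetical helper: walk an integer range and divide each value
# by a scale factor to obtain evenly spaced floats.
def float_steps(first, last, scale)
  (first..last).map { |i| i / scale.to_f }
end

p float_steps(20, 22, 10)     # [2.0, 2.1, 2.2]

fr = 2.0..2.2
p fr.include?(2.1)            # true (membership needs only comparison)

begin
  fr.each { |x| puts x }      # Float has no succ, so this raises
rescue TypeError => e
  puts "cannot iterate: #{e.message}"
end
```

The integer range supplies the succ behavior, and the division produces the floats, so the number of iterations is under our control.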

6.2.4. Testing Range Membership

Ranges are not much good if we can't determine whether an item lies within a given range. As it turns out, the include? method makes this easy:

r1 = 23456..34567
x = 14142
y = 31416
r1.include?(x)      # false
r1.include?(y)      # true

The method member? is an alias.

But how does this work internally? How does the interpreter determine whether an item is in a given range? Actually, it makes this determination simply by comparing the item with the endpoints (so that range membership is dependent on the existence of a meaningful <=> operator).

Therefore to say (a..b).include?(x) is equivalent to saying x >= a and x <= b.

Once again, beware of string ranges.

s1 = "2".."5"
str = "28"
s1.include?(str)    # true (misleading!)
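The reason for this result becomes clearer if we spell out the endpoint comparison by hand. This is only a sketch of the comparison described above; the helper name between_endpoints? is hypothetical.

```ruby
# Hypothetical helper: the endpoint comparison written out explicitly.
def between_endpoints?(a, b, x)
  x >= a && x <= b
end

p between_endpoints?("2", "5", "28")      # true: "28" sorts after "2" and before "5"
p between_endpoints?(2, 5, 28)            # false: as numbers, 28 is outside 2..5
p between_endpoints?(2, 5, "28".to_i)     # false: convert first if numbers are meant
```

If the strings really represent numbers, converting them before the test avoids the surprise.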

6.2.5. Converting to Arrays

When we convert a range to an array, the interpreter simply applies succ repeatedly until the end is reached, appending each item onto an array that is returned:

r = 3..12
arr = r.to_a     # [3,4,5,6,7,8,9,10,11,12]

This naturally won't work with Float ranges. It may sometimes work with String ranges, but this should be avoided because the results will not always be obvious or meaningful.
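The conversion process can be imitated in a few lines. This is a rough sketch under the assumption that the endpoints respond to succ and <=>; the method name expand is hypothetical, and this is not how the interpreter is actually implemented.

```ruby
# Hypothetical helper: build an array by applying succ repeatedly,
# stopping when the end of the range is reached.
def expand(first, last, exclusive = false)
  result = []
  current = first
  while (current <=> last) < 0
    result << current
    current = current.succ
  end
  # Append the endpoint itself only for an inclusive range.
  result << current if !exclusive && current == last
  result
end

p expand(3, 12)        # [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
p expand(3, 6, true)   # [3, 4, 5]
p expand(6, 3)         # []
```

Because the loop depends on succ, it is easy to see why a Float range cannot be converted this way.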

6.2.6. Backward Ranges

Does a backward range make any sense? Yes and no. This is a perfectly valid range:

r = 6..3
x = r.begin              # 6
y = r.end                # 3
flag = r.exclude_end?    # false

As you see, we can determine its starting and ending points and whether the end is included in the range. However, that is nearly all we can do with such a range.

arr = r.to_a       # []
r.each {|x| p x}   # No iterations
y = 5
r.include?(y)      # false (for any value of y)

Does that mean that backward ranges are necessarily "evil" or useless? Not at all. It is still useful, in some cases, to have the endpoints encapsulated in a single object.

In fact, arrays and strings frequently take "backward ranges" because these are zero-indexed from the left but "minus one"-indexed from the right. Therefore we can use expressions like these:

string = "flowery"
str1   = string[0..-2]   # "flower"
str2   = string[1..-2]   # "lower"
str3   = string[-5..-3]  # "owe" (actually a forward range)
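The same indexing convention applies to arrays. As a small illustration, here is a sketch with a hypothetical helper, trim_ends, that drops the first and last elements of either a string or an array.

```ruby
# Hypothetical helper: the range 1..-2 is "backward" as a range,
# but as an index it means "second element through next-to-last."
def trim_ends(seq)
  seq[1..-2]
end

p trim_ends("flowery")          # "lower"
p trim_ends(%w[a b c d e])      # ["b", "c", "d"]
p (1..-2).to_a                  # [] (the range itself is still empty)
```

The range object only carries the two endpoints; it is the indexing method that interprets the negative value as an offset from the right.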

6.2.7. The Flip-Flop Operator

When a range is used in a condition, it is treated specially. This usage of .. is called the flip-flop operator because it is essentially a toggle that keeps its own state.

This trick, apparently originating with Perl, is useful. But understanding how it works takes a little effort.

Imagine we had a Ruby source file with embedded docs between =begin and =end tags. How would we extract and output only those sections? (Our state toggles between "inside" a section and "outside" a section, hence the flip-flop concept.) The following piece of code, while perhaps unintuitive, will work:

loop do
  break if eof?
  line = gets
  puts line if (line=~/=begin/)..(line=~/=end/)
end

How can this work? The magic all happens in the flip-flop operator.

First, realize that this "range" is preserving a state internally, but this fact is hidden. When the left endpoint becomes true, the range itself returns true; it then remains true until the right endpoint becomes true, and the range toggles to false.
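To watch the toggle without reading from standard input, we can apply the same condition to an in-memory array of lines. This is a minimal sketch; the method name extract_docs and the sample lines are assumptions made for illustration.

```ruby
# Hypothetical helper: collect the lines for which the flip-flop is "on."
def extract_docs(lines)
  docs = []
  lines.each do |line|
    # Turns on when the left test matches; turns off after the right test matches.
    docs << line if (line =~ /=begin/)..(line =~ /=end/)
  end
  docs
end

source = ["x = 1", "=begin", "docs here", "=end", "y = 2"]
p extract_docs(source)    # ["=begin", "docs here", "=end"]
```

Notice that the delimiter lines themselves are included, because the condition is true on the line that switches it on and on the line that switches it off.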

This kind of feature might be needed in many cases. Some examples are parsing HTML, parsing section-oriented config files, selecting ranges of items from lists, and so on.

However, I personally don't like the syntax. Others are also dissatisfied with it, perhaps even Matz himself. This behavior may be removed from Ruby in the future. But I'll show a convenient way to get the same functionality.

What's wrong with the flip-flop? This is my own opinion.

First, in the preceding example, take a line with the value =begin. A reminder: The =~ operator does not return true or false as we might expect; it returns the position of the match (a Fixnum) or nil if there was no match. So then the expressions in the range evaluate to 0 and nil, respectively.

However, if we try to construct a range from 0 to nil, it gives us an error because it is nonsensical:

range = 0..nil    # error!

Furthermore, bear in mind that in Ruby, only false and nil evaluate to false; everything else evaluates as true. Then a range ordinarily would not evaluate as false.

puts "hello" if x..y
# Prints "hello" for any valid range x..y

And again, suppose we stored these values in variables and then used the variables to construct the range. This doesn't work; the test is always true.

loop do
  break if eof?
  line = gets
  start = line=~/=begin/
  stop = line=~/=end/
  puts line if start..stop
end

What if we put the range itself in a variable? This doesn't work either. Once again, the test is always true.

loop do
  break if eof?
  line = gets
  range = (line=~/=begin/)..(line=~/=end/)
  puts line if range
end

To understand this, we have to understand that the entire range (with both endpoints) is re-evaluated each time the loop is run, but the internal state is also factored in. The flip-flop operator is therefore not a true range at all. The fact that it looks like a range but is not is considered a bad thing by some.

Finally, think of the endpoints of the flip-flop. They are re-evaluated every time, but this re-evaluation cannot be captured in a variable that can be substituted. In effect, the flip-flop's endpoints are like procs. They are not values; they are code. The fact that something that looks like an ordinary expression is really a proc is also undesirable.

Having said all that, the functionality is still useful. Can we write a class that encapsulates this function without being so cryptic and magical? As it turns out, this is not difficult. In Listing 6.1, we introduce a simple class called Transition, which mimics the behavior of the flip-flop.

Listing 6.1. The Transition Class

class Transition
  A, B = :A, :B
  T, F = true, false
          # state,p1,p2  => newstate, result
  Table = {[A,F,F]=>[A,F], [B,F,F]=>[B,T],
           [A,T,F]=>[B,T], [B,T,F]=>[B,T],
           [A,F,T]=>[A,F], [B,F,T]=>[A,T],
           [A,T,T]=>[A,T], [B,T,T]=>[A,T]}
  def initialize(proc1, proc2)
    @state = A
    @proc1, @proc2 = proc1, proc2
    check?
  end
  def check?
    p1 = @proc1.call ? T : F
    p2 = @proc2.call ? T : F
    @state, result = *Table[[@state,p1,p2]]
    return result
  end
end

In the Transition class, we use a simple state machine to manage transitions. We initialize it with a pair of procs (the same ones used in the flip-flop). We do lose a little convenience in that any variables (such as line) used in the procs must already be in scope. But we now have a solution with no "magic" in it, where all expressions behave as they do any other place in Ruby.

Here's a slight variant on the same solution. Let's change the initialize method to take a proc and two arbitrary expressions:

def initialize(var,flag1,flag2)
  @state = A
  @proc1 = proc { flag1 === var.call }
  @proc2 = proc { flag2 === var.call }
  check?
end

The case equality operator is used to check the relationship of the starting and ending flags with the variable. The variable is wrapped in a proc because we pass this value in only once; we need to be able to re-evaluate it. Because a proc is a closure, this is not a problem.

Here is how we use the new code version:

line = nil
trans = Transition.new(proc {line}, /=begin/, /=end/)
loop do
  break if eof?
  line = gets
  puts line if trans.check?
end

I do recommend an approach like this, which is more explicit and less magical. This will be especially important when the flip-flop operator does in fact go away.
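For a self-contained trial that does not depend on standard input, the variant can be exercised against an array of lines. The sketch below repeats the class with the three-argument initialize so that it runs on its own; the method name doc_lines and the sample data are assumptions made for illustration.

```ruby
class Transition
  A, B = :A, :B
  T, F = true, false
          # state,p1,p2  => newstate, result
  Table = {[A,F,F]=>[A,F], [B,F,F]=>[B,T],
           [A,T,F]=>[B,T], [B,T,F]=>[B,T],
           [A,F,T]=>[A,F], [B,F,T]=>[A,T],
           [A,T,T]=>[A,T], [B,T,T]=>[A,T]}

  def initialize(var, flag1, flag2)
    @state = A
    @proc1 = proc { flag1 === var.call }
    @proc2 = proc { flag2 === var.call }
    check?
  end

  def check?
    p1 = @proc1.call ? T : F
    p2 = @proc2.call ? T : F
    @state, result = *Table[[@state, p1, p2]]
    result
  end
end

# Hypothetical helper: select the lines lying between the two markers.
def doc_lines(lines)
  line = nil                  # captured by the proc below
  trans = Transition.new(proc { line }, /=begin/, /=end/)
  lines.select do |current|
    line = current            # update the variable the closure sees
    trans.check?
  end
end

source = ["x = 1", "=begin", "docs here", "=end", "y = 2"]
p doc_lines(source)           # ["=begin", "docs here", "=end"]
```

The closure over line is what lets the object re-evaluate its tests on every call, which is the part the flip-flop hides from us.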

6.2.8. Custom Ranges

Let's look at an example of a range made up of some arbitrary object. Listing 6.2 shows a simple class to handle Roman numerals.

Listing 6.2. A Roman Numeral Class

class Roman
  include Comparable
  I,IV,V,IX,X,XL,L,XC,C,CD,D,CM,M =
    1, 4, 5, 9, 10, 40, 50, 90, 100, 400, 500, 900, 1000
  Values = %w[M CM D CD C XC L XL X IX V IV I]
  # Convert a decimal value to a Roman numeral string
  def Roman.encode(value)
    return "" if value == 0
    Values.each do |letters|
      rnum = const_get(letters)
      if value >= rnum
        return letters + encode(value-rnum)
      end
    end
    ""
  end
  # Convert a Roman numeral string to its decimal value
  def Roman.decode(rvalue)
    sum = 0
    letters = rvalue.split('')
    letters.each_with_index do |letter,i|
      this = const_get(letter)
      that = const_get(letters[i+1]) rescue 0
      op = that > this ? :- : :+
      sum = sum.send(op,this)
    end
    sum
  end
  def initialize(value)
    case value
      when String
        @roman = value
        @decimal = Roman.decode(@roman)
      when Symbol
        @roman = value.to_s
        @decimal = Roman.decode(@roman)
      when Numeric
        @decimal = value
        @roman = Roman.encode(@decimal)
    end
  end
  def to_i
    @decimal
  end
  def to_s
    @roman
  end
  def succ
    Roman.new(@decimal+1)
  end
  def <=>(other)
    self.to_i <=> other.to_i
  end
end
def Roman(val)
  Roman.new(val)
end

I'll cover a few highlights of this class first. It can be constructed using a string or a symbol (representing a Roman numeral) or a Fixnum (representing an ordinary Hindu-Arabic decimal number). Internally, conversion is performed, and both forms are stored. There is a "convenience method" called Roman, which simply is a shortcut to calling the Roman.new method. The class-level methods encode and decode handle conversion to and from Roman form, respectively.

For simplicity, I haven't done any error checking. I also assume that the Roman letters are uppercase.

The to_i method naturally returns the decimal value, and the to_s method predictably returns the Roman form. We define succ to be the next Roman number; for example, Roman(:IV).succ would be Roman(:V).

We implement the comparison operator by comparing the decimal equivalents in a straightforward way. We do an include of the Comparable module so that we can get the less-than and greater-than operators (which depend on the existence of the comparison method <=>).

Notice the gratuitous use of symbols in this fragment:

op = that > this ? :- : :+
sum = sum.send(op,this)

In the preceding fragment, we're actually choosing which operation (denoted by a symbol) to perform: addition or subtraction. This code fragment is just a short way of saying:

if that > this
  sum -= this
else
  sum += this
end

The second fragment is longer but arguably clearer.
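For readers who have not seen send used with an operator symbol, here is a tiny standalone sketch of the idea. The helper name apply_op is hypothetical.

```ruby
# Hypothetical helper: invoke the operator named by a symbol.
def apply_op(op, left, right)
  left.send(op, right)    # same as left + right or left - right
end

p apply_op(:+, 10, 3)     # 13
p apply_op(:-, 10, 3)     # 7
```

This works because operators such as + and - are ordinary methods in Ruby, so their names can be passed around as symbols.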

Because this class has both a succ method and a full set of relational operators, we can use it in a range. The following sample code demonstrates this:

require 'roman'
y1 = Roman(:MCMLXVI)
y2 = Roman(:MMIX)
range = y1..y2                # 1966..2009
range.each {|x| puts x}       # 44 lines of output
epoch = Roman(:MCMLXX)
range.include?(epoch)         # true
doomsday = Roman(2038)
range.include?(doomsday)      # false
Roman(:V) == Roman(:IV).succ  # true
Roman(:MCM) < Roman(:MM)      # true