2.8. Tokenizing a String

The split method parses a string and returns an array of tokens. It accepts two optional parameters: a delimiter and a field limit (an integer).

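The delimiter defaults to whitespace. More precisely, the default is the value of $; (or its English equivalent $FIELD_SEPARATOR, available after require 'English'); that variable is nil unless it has been set, and a nil delimiter means "split on runs of whitespace." If the delimiter is a string, the explicit value of that string is used as a token separator. If it is a regular expression, each match of the pattern acts as a separator.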

s1 = "It was a dark and stormy night."
words = s1.split          # ["It", "was", "a", "dark", "and",
                          #  "stormy", "night."]
s2 = "apples, pears, and peaches"
list = s2.split(", ")     # ["apples", "pears", "and peaches"]
s3 = "lions and tigers and bears"
zoo = s3.split(/ and /)   # ["lions", "tigers", "bears"]
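The whitespace default behaves differently from a regex that merely matches whitespace, which is easy to overlook. The following is a small sketch of that difference; it assumes $; has been left at its default of nil, and the sample string is invented for illustration.

```ruby
# Assumes $; is still nil (its default), so split falls back to whitespace.
line = "  alpha   beta\tgamma\n"

# No argument: leading whitespace is ignored and runs of whitespace collapse.
p line.split          # ["alpha", "beta", "gamma"]

# A single-space string is special-cased and behaves the same way.
p line.split(" ")     # ["alpha", "beta", "gamma"]

# A regex matching one space splits at every space, so empty fields appear.
p line.split(/ /)     # ["", "", "alpha", "", "", "beta\tgamma\n"]

# A whitespace-run regex still yields a leading empty field.
p line.split(/\s+/)   # ["", "alpha", "beta", "gamma"]
```

When input may begin with blanks, the bare split (or split(" ")) is usually the more convenient form, because it needs no cleanup of empty leading fields.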


The limit parameter places an upper limit on the number of fields returned, according to these rules:

  1. If it is omitted, trailing null entries are suppressed.

  2. If it is a positive number, the number of entries will be limited to that number (stuffing the rest of the string into the last field as needed). Trailing null entries are retained.

  3. If it is a negative number, there is no limit to the number of fields, and trailing null entries are retained.

These three rules are illustrated here:

str = "alpha,beta,gamma,,"
list1 = str.split(",")     # ["alpha","beta","gamma"]
list2 = str.split(",",2)   # ["alpha", "beta,gamma,,"]
list3 = str.split(",",4)   # ["alpha", "beta", "gamma", ","]
list4 = str.split(",",8)   # ["alpha", "beta", "gamma", "", ""]
list5 = str.split(",",-1)  # ["alpha", "beta", "gamma", "", ""]
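A common practical use of a positive limit is to split only at the first delimiter, so that the remainder may itself contain the delimiter. Here is a minimal sketch; parse_setting is a hypothetical helper invented for this illustration, and the "key = value" line format is an assumption.

```ruby
# parse_setting is a hypothetical helper, not part of Ruby itself.
# A limit of 2 splits only at the first "=", so the value may contain "=" too.
def parse_setting(line)
  key, value = line.split("=", 2)
  # value is nil when the line has no "=" at all; to_s turns that into "".
  [key.strip, value.to_s.strip]
end

p parse_setting("path = /usr/bin")    # ["path", "/usr/bin"]
p parse_setting("query = a=1&b=2")    # ["query", "a=1&b=2"]
p parse_setting("empty =")            # ["empty", ""]
```

Because a positive limit retains trailing null entries, the last call still returns two fields rather than one.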


The scan method can be used to match regular expressions or strings against a target string:

str = "I am a leaf on the wind..."
# A string is interpreted literally, not as a regex
arr = str.scan("a")        # ["a","a","a"]
# A regex will return all matches
arr = str.scan(/\w+/)      # ["I", "am", "a", "leaf", "on", "the", "wind"]
# A block can be specified
str.scan(/\w+/) {|x| puts x }
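When the regex contains capture groups, scan returns an array of arrays, one inner array per match, holding the captured substrings. The sketch below shows this; extract_pairs is a hypothetical helper and the "name=number" input format is an assumption made for the example.

```ruby
# extract_pairs is a hypothetical helper used only for this illustration.
def extract_pairs(text)
  # Each match is a [name, digits] array because the pattern has two groups.
  text.scan(/(\w+)=(\d+)/).map { |name, digits| [name, digits.to_i] }.to_h
end

dims = "width=80, height=24, depth=3"

# Without groups, each match is a plain string.
p dims.scan(/\d+/)            # ["80", "24", "3"]

# With groups, each match is an array of the captured pieces.
p dims.scan(/(\w+)=(\d+)/)    # [["width", "80"], ["height", "24"], ["depth", "3"]]

# The nested arrays convert naturally into a Hash of names to integers.
p extract_pairs(dims)
```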


The StringScanner class, from the standard library, is different in that it maintains state for the scan rather than doing it all at once:

require 'strscan'
str = "Watch how I soar!"
ss = StringScanner.new(str)
loop do
  word = ss.scan(/\w+/)    # Grab a word at a time
  break if word.nil?
  puts word
  sep = ss.scan(/\W+/)     # Grab next non-word piece
  break if sep.nil?
end
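Because the scanner remembers its position, it suits simple tokenizers that try several patterns at each step. The following is a rough sketch under stated assumptions, not a definitive design: tokenize is a hypothetical helper, and the three token categories (integers, identifiers, single-character operators) are chosen only for illustration. It relies on the StringScanner methods eos?, skip, scan, peek, and pos.

```ruby
require 'strscan'

# tokenize is a hypothetical helper written for this illustration.
# It returns [type, text] pairs for integers, identifiers, and operators.
def tokenize(source)
  ss = StringScanner.new(source)
  tokens = []
  until ss.eos?
    if ss.skip(/\s+/)
      next                                   # ignore whitespace between tokens
    elsif (text = ss.scan(/\d+/))
      tokens << [:int, text]
    elsif (text = ss.scan(/[A-Za-z_]\w*/))
      tokens << [:ident, text]
    elsif (text = ss.scan(%r{[-+*/=()]}))
      tokens << [:op, text]
    else
      # Nothing matched at the current position; report it instead of looping.
      raise ArgumentError,
            "unexpected character #{ss.peek(1).inspect} at position #{ss.pos}"
    end
  end
  tokens
end

p tokenize("total = (count + 42) * 2")
# [[:ident, "total"], [:op, "="], [:op, "("], [:ident, "count"], [:op, "+"],
#  [:int, "42"], [:op, ")"], [:op, "*"], [:int, "2"]]
```

Each pattern is tried only at the current scan position, so the order of the branches decides which token type wins when more than one could match.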





The Ruby Way, Second Edition: Solutions and Techniques in Ruby Programming (2nd Edition)
ISBN: 0672328844
Year: 2004
Pages: 269
Authors: Hal Fulton
