Problem
You want to process each character of a string individually.
Solution
If you're processing an ASCII document, then each byte corresponds to one character. Use String#each_byte to yield each byte of a string as a number, which you can turn into a one-character string:
'foobar'.each_byte { |x| puts "#{x} = #{x.chr}" }
# 102 = f
# 111 = o
# 111 = o
# 98 = b
# 97 = a
# 114 = r
Use String#scan to yield each character of a string as a new one-character string:
'foobar'.scan(/./) { |c| puts c }
# f
# o
# o
# b
# a
# r
Discussion
Since a string is a sequence of bytes, you might think that the String#each method would iterate over the sequence, the way Array#each does. But String#each is actually used to split a string on a given record separator (by default, the newline):
"foo\nbar".each { |x| puts x }
# foo
# bar
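The record-separator behavior can also be sketched with String#each_line, which is a synonym for String#each in the Ruby version this recipe targets. Note, as an addition to the recipe, that String#each was removed in Ruby 1.9, so each_line is the form that runs on later versions. The sample strings and variable names here are illustrative only:

```ruby
# each_line yields one record at a time and keeps the separator
# at the end of every record that had one.
lines = []
"foo\nbar".each_line { |line| lines << line }
# lines is now ["foo\n", "bar"]

# Passing an argument replaces the default separator (the newline).
fields = []
"a,b,c".each_line(',') { |field| fields << field }
# fields is now ["a,", "b,", "c"]
```

Because the separator stays attached to each record, code that wants the bare record usually calls chomp on it.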
The string equivalent of Array#each is actually String#each_byte. A string stores its characters as a sequence of bytes, and each_byte yields each byte in that sequence as a Fixnum.
String#each_byte is faster than String#scan, so if you're processing an ASCII file, you might want to use String#each_byte and convert each number passed into the code block back into a string (as seen in the Solution).
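As a minimal sketch of that byte-to-string conversion, the helper below collects the characters of an ASCII string with each_byte and Integer#chr. The name ascii_chars is hypothetical, not part of Ruby, and the approach is only valid when every character is a single byte:

```ruby
# Hypothetical helper: turn an ASCII string into an array of
# one-character strings by converting each byte with chr.
def ascii_chars(string)
  chars = []
  string.each_byte { |byte| chars << byte.chr }
  chars
end

ascii_chars('foobar')   # => ["f", "o", "o", "b", "a", "r"]
'foobar'.scan(/./)      # => ["f", "o", "o", "b", "a", "r"]
```

For ASCII input both approaches give the same list of characters, so the choice between them comes down to speed and readability.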
String#scan works by applying a given regular expression to a string, and yielding each match to the code block you provide. The regular expression /./ matches each character in the string in turn, except for newlines, which the dot does not match unless you add the m modifier (/./m).
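One detail worth checking when you scan with /./ is how the dot treats newlines. The following sketch uses an illustrative string to show that the dot skips newline characters by default, and that the m (multiline) modifier makes it match them too. Called without a block, scan returns the matches as an array:

```ruby
text = "ab\ncd"

# By default, the dot does not match a newline, so it is skipped.
without_newlines = text.scan(/./)
# => ["a", "b", "c", "d"]

# With the m modifier, the dot matches newlines as well.
with_newlines = text.scan(/./m)
# => ["a", "b", "\n", "c", "d"]
```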
If you have the $KCODE variable set correctly, then the scan technique will work on UTF-8 strings as well. This is the simplest way to sneak a notion of "character" into Ruby's byte-based strings.
Here's a Ruby string containing the UTF-8 encoding of the French phrase "ça va":
french = "\xc3\xa7a va"
Even if your terminal can't properly display the character "ç", you can see how the behavior of String#scan changes when you make the regular expression Unicode-aware, or set $KCODE so that Ruby handles all strings as UTF-8:
french.scan(/./) { |c| puts c }
#
#
# a
#
# v
# a

french.scan(/./u) { |c| puts c }
# ç
# a
#
# v
# a

$KCODE = 'u'
french.scan(/./) { |c| puts c }
# ç
# a
#
# v
# a
Once Ruby knows to treat strings as UTF-8 instead of ASCII, it starts treating the two bytes representing the "ç" as a single character. Even if you can't see UTF-8, you can write programs that handle it correctly.
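The difference between bytes and characters can also be checked numerically, without relying on the terminal. This sketch is an addition to the recipe and swaps in String#unpack rather than scan: the 'C*' directive reads the string as raw bytes, while 'U*' decodes it as UTF-8 and returns one code point per character. The variable names are illustrative:

```ruby
french = "\xc3\xa7a va"

# Six bytes: the "ç" takes two of them (0xc3 and 0xa7).
bytes = french.unpack('C*')
# => [195, 167, 97, 32, 118, 97]

# Five characters: the two bytes of "ç" decode to one code point, 231.
code_points = french.unpack('U*')
# => [231, 97, 32, 118, 97]
```

Comparing bytes.size with code_points.size is a quick way to tell whether a string contains any multibyte characters.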