You have a plain-text string in a comma-delimited format. You need to parse this string, either to build a data structure or to perform some operation on the data and write it back out.
The built-in csv library can parse most common character-delimited formats. The FasterCSV library, available as the fastercsv gem, improves on csvs performance and interface. Ill show you both, but I recommend fastercsv unless you can use any software at all outside the standard library.
CSV::Reader.parse and FasterCSV.parse work the same way: they accept a string or an open file as an argument, and yield each parsed row of the comma-delimited file as an array. The csv yields a Row object that acts like an array full of Column objects. FasterCSV just yields an array of strings.
require csv primary_colors = "red,green,blue red,yellow,blue" CSV::Reader.parse(primary_colors) { |row| row.each { |cell| puts cell }} # red # green # blue # red # yellow # blue require ubygems require faster_csv shakespeare = %{Sweet are the uses of adversity,As You Like It "We few, we happy few",Henry V "Seems, madam! nay it is; I know not ""seems.""",Hamlet} FasterCSV.parse(shakespeare) { |row| puts "#{row[0]} -- #{row[1]}"} # Sweet are the uses of adversity -- As You Like It # We few, we happy few -- Henry V # Seems, madam! nay it is; I know not "seems." -- Hamlet
Comma-delimited formats are among the most basic portable file formats. Unfortunately, they e also among the least standardized. There are many different formats, and some are internally inconsistent.
FasterCSV and the csv library can parse every comma-delimited format, but they will parse common formats like the one used by Microsoft Excel, and they e your best tool for making sense of the myriad.
FasterCSV and csv both model a comma-delimited file as a nested array of strings. The csv librarys CSV class uses Row objects and Column objects instead of arrays and strings, but its the same idea. The terminology is from the spreadsheet worldunderstand-ably, since a CSV file is a common way of portably storing spreadsheet data.
The complications begin when the spreadsheet cells themselves contain commas or newlines. The standard way to handle this when exporting to comma-delimited format is to surround those cells with double quotes. Then the question becomes what to do with cells that contain double-quote characters. Both Ruby CSV libraries assume that double-quote characters are escaped by doubling, turning each " into "", as in the Hamlet quotation:
%{"Seems, madam! nay it is; I know not ""seems.""",Hamlet}
If you e certain that there are no commas or newlines embedded in your data, and thus no need for quote handling, you can use String#split to parse delimited records more quickly than csv. To output to this format, you can use Array#join:
def parse_delimited_naive(input, fieldsep=,, rowsep=" ") input.split(rowsep).inject([]) do |arr, line| arr << line.split(fieldsep) end end def join_delimited_naive(structure, fieldsep=,, rowsep=" ") rows = structure.inject([]) do |arr, parsed_line| arr << parsed_line.join(fieldsep) end rows.join(rowsep) end parse_delimited_naive("1,2,3,4 5,6,7,8") # => [["1", "2", "3", "4"], ["5", "6", "7", "8"]] join_delimited_naive(parse_delimited_naive("1,2,3,4 5,6,7,8")) # => "1,2,3,4 5,6,7,8" parse_delimited_naive(1;2;3;4|5;6;7;8, ;, |) # => [["1", "2", "3", "4"], ["5", "6", "7", "8"]] parse_delimited_naive(1,"2,3",4) # => [["1", ""2", "3"", "4"]]
This is not recommended unless you wrote all the relevant code yourself, or can manually inspect the code as well as the dataset. Just because you haven seen any quoted cells yet doesn mean there won be any in the future. When in doubt, use csv or fastercsv. Handwritten CSV generators and parsers are a leading cause of bad data.
To create a comma-delimited file, open an output file with CSV.open or FasterCSV.open, and append a series of arrays to the resulting file-like object. Every array you append will be converted to a comma-delimited row in the destination file.
data = [[1,2,3],[A,B,C],[do, e,mi]] writer = FasterCSV.open(first3.csv, w) data.each { |x| writer << x } writer.close puts open(first3.csv).read() # 1,2,3 # A,B,C # do,re,mi data = [] FasterCSV.foreach(first3.csv) { |row| data << row } data # => [["1", "2", "3"], ["A", "B", "C"], ["do", "re", "mi"]]
Strings
Numbers
Date and Time
Arrays
Hashes
Files and Directories
Code Blocks and Iteration
Objects and Classes8
Modules and Namespaces
Reflection and Metaprogramming
XML and HTML
Graphics and Other File Formats
Databases and Persistence
Internet Services
Web Development Ruby on Rails
Web Services and Distributed Programming
Testing, Debugging, Optimizing, and Documenting
Packaging and Distributing Software
Automating Tasks with Rake
Multitasking and Multithreading
User Interface
Extending Ruby with Other Languages
System Administration