Recipe 8.10. Reading a Comma-Separated-Values File into a String Array


Problem

You need to read a CSV file into an array.

Solution

Sample code folder: Chapter 08\ReadCSVFiles

Use the Split() function to parse the file's content and fill an array.

Discussion

Today's computers generally have a lot of memory, which often allows entire files to be read into a single string in one operation. If you have an extremely large CSV file, you might want to read the file one line at a time. In either case, the Split() function provides a great tool for parsing the comma-separated values so they can be copied into an array.
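For the very large file case, reading one line at a time could look something like the following. This is a minimal sketch, not the recipe's sample code: the module name LineByLineCsv and the function name ReadCsvByLine are hypothetical, and the sketch assumes simple CSV data with no quoted fields containing commas.

```vb
Imports System.Collections.Generic
Imports System.IO

Module LineByLineCsv
    ' Reads a CSV file one line at a time and returns one String array
    ' per line. (Hypothetical helper; assumes no quoted fields.)
    Public Function ReadCsvByLine(ByVal filePath As String) As List(Of String())
        Dim rows As New List(Of String())
        Using reader As New StreamReader(filePath)
            ' ReadLine() returns Nothing once the end of the file is reached.
            Dim lineOfText As String = reader.ReadLine()
            Do While lineOfText IsNot Nothing
                rows.Add(lineOfText.Split(","c))
                lineOfText = reader.ReadLine()
            Loop
        End Using
        Return rows
    End Function
End Module
```

Because only one line is held in memory at a time while reading, this approach trades the single-call convenience of ReadAllText() for a smaller working set.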

The following code reads the entire file created in the previous recipe into a single string, and then splits this string into an array of strings, lineData, using the newline characters as the split point. Each line is then further split at the comma character separating individual words. If the CSV file contains numbers, this is the point where each "word" of the text from the file could be converted to Double, Integer, or whatever type is appropriate. In this example, however, the words are simply reformatted for display and verification in a message box:

 Dim result As New System.Text.StringBuilder
 Dim wholeFile As String
 Dim lineData( ) As String
 Dim fieldData( ) As String

 ' ----- Read in the file.
 Dim filePath As String = _
    My.Computer.FileSystem.CurrentDirectory & "\Test.csv"
 wholeFile = My.Computer.FileSystem.ReadAllText(filePath)

 ' ----- Process each line.
 lineData = Split(wholeFile, vbNewLine)
 'OR: lineData = wholeFile.Split(New String( ) {vbNewLine}, _
 '       StringSplitOptions.None)
 For Each lineOfText As String In lineData
    ' ----- Process each field.
    fieldData = lineOfText.Split(",")
    For Each wordOfText As String In fieldData
       result.Append(wordOfText)
       result.Append(Space(1))
    Next wordOfText
    result.AppendLine( )
 Next lineOfText
 MsgBox(result.ToString( ))
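If the file holds numbers, the conversion step mentioned earlier might be sketched as follows. The module name NumericFields and the function name ParseNumericLine are hypothetical, and storing 0.0 for unparseable fields is an assumption made only for this illustration; real code might prefer to report the error instead.

```vb
Imports System.Globalization

Module NumericFields
    ' Converts the comma-separated fields of one line to Double values.
    ' Fields that cannot be parsed are stored as 0.0 (an assumption
    ' made for this sketch).
    Public Function ParseNumericLine(ByVal lineOfText As String) As Double()
        Dim fieldData() As String = lineOfText.Split(","c)
        Dim values(fieldData.Length - 1) As Double
        For index As Integer = 0 To fieldData.Length - 1
            Dim parsed As Double
            ' The invariant culture is used so that "1.5" is read the
            ' same way regardless of the machine's regional settings.
            If Double.TryParse(fieldData(index), NumberStyles.Float, _
                    CultureInfo.InvariantCulture, parsed) Then
                values(index) = parsed
            Else
                values(index) = 0.0
            End If
        Next index
        Return values
    End Function
End Module
```

Using TryParse() rather than CDbl() avoids an exception when a field is blank or contains text.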

String objects have a Split() method, and Visual Basic 2005 also provides a Split() function. Notice the commented-out line in the previous code. It shows how wholeFile can be split using the string's Split() method instead of the Split() function, and it's useful to compare it with the line just above it. In both cases lineData is filled with the lines of the file, but the syntax differs between the two Split() variations. The string Split() method accepts only individual characters or an array of strings as the split point. In other words, you'll run into trouble if you try to split the lines in the following way:

 lineData = wholeFile.Split(vbNewLine, StringSplitOptions.None) 

The special constant vbNewLine is actually two characters in length (carriage return and line feed), and the resulting strings will all still contain one of these two characters. It took considerable time and effort to debug the rather strange results when we first encountered this problem. To avoid it, pass an array of multicharacter strings to the string Split() method, as shown in the commented-out line in the code above, or use the Visual Basic 2005 Split() function, which has a simpler syntax and does accept multicharacter strings for the split point. Figure 8-10 shows the result of running the example code.

Figure 8-10. Parsing CSV files into arrays using Split( )
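The two working forms discussed in this recipe can be compared side by side. This is a small sketch for illustration; the module and function names (SplitComparison, SplitLinesWithFunction, SplitLinesWithMethod) are hypothetical and are not part of the recipe's sample code.

```vb
Module SplitComparison
    ' Visual Basic Split() function: the whole delimiter string
    ' (here the two-character vbNewLine) is treated as one split point.
    Public Function SplitLinesWithFunction(ByVal text As String) As String()
        Return Split(text, vbNewLine)
    End Function

    ' String.Split() method: passing a String array makes it treat
    ' vbNewLine as a single multicharacter delimiter as well.
    Public Function SplitLinesWithMethod(ByVal text As String) As String()
        Return text.Split(New String() {vbNewLine}, StringSplitOptions.None)
    End Function
End Module
```

For an input such as "one,two" & vbNewLine & "three,four", both functions should return the same two-element array, with neither element containing a stray carriage return or line feed.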


See Also

Recipe 8.9 shows the reverse of this recipe.

Recipe 8.12 discusses the differences between the Split() function and the Split() method in more detail. Also, see Recipe 5.44 for more on the Split() function and method.




Visual Basic 2005 Cookbook: Solutions for VB 2005 Programmers (Cookbooks (OReilly))
ISBN: 0596101775
Year: 2006
Pages: 400
