Defining the FileSort Demo


Visual Basic .NET Unleashed
By Paul Kimmel
Chapter 13.  Creating a Console Application

Console applications these days are typically simple utilities. Dynamic link libraries and executables with graphical interfaces are probably better candidates for more complicated applications, although the language imposes no limitation on console applications other than the absence of graphical interface controls. With the addition of multithreading, the complexity and variety of console applications is more likely to increase than to decrease. In keeping with the utility theme, however, we will create a multithreaded sort utility.

The sample application adds a twist you might find helpful. A lot of legacy data is still in existence. Big corporations with demanding data needs move data back and forth between mainframes with older data formats and newer client/server systems. Whether this is the best way to manage the data stored in legacy systems is debatable, but it is a practical problem. The solution assumes that it is beneficial to provide a flexible means of defining how the data is sorted.

The FileSort.sln sample application allows you to pass multiple files on the command line and sorts each file on its own thread. Because each file has its own thread, shorter files do not wait on longer files. The sample application also demonstrates how to implement a comparer apart from the core solution.

Using TextReader

FileSort.sln validates multiple files passed to the application on the command line. For each valid file, FileSort reads the file contents into an ArrayList and uses the instance Sort method of the ArrayList to sort the data (line 85 of Listing 13.7). It is reasonable to assume that you might want to perform different kinds of comparisons on the data and order the data in ascending or descending order based on predetermined criteria. It is the comparison process that guides the ordering of data, so we will use the IComparer interface to manage the sort order rather than implementing alternate sorts.

To get data into the ArrayList, we will assume that the data is represented by lines of text, and we will use the methods of the ArrayList and TextReader classes to read and cache the contents of text files. The complete solution for FileSort.sln is shown in Listing 13.7, and the subsections that follow examine excerpts that illustrate the various aspects of the solution.

Listing 13.7 The complete listing for the multithreaded FileSort.sln sample program, which demonstrates sorting lines of text and implementing the IComparer interface
  1:  Imports System.IO
  2:
  3:  Public Class Main
  4:
  5:    Private Overloads Shared Function ValidCommandLine( _
  6:      ByVal CommandLine As String) As Boolean
  7:      Return Sorter.Valid(CommandLine)
  8:    End Function
  9:
  10:    Private Shared Function GetEnumerator( _
  11:      ByVal CommandLine As String) As IEnumerator
  12:      Return CommandLine.Split(", ".ToCharArray()).GetEnumerator
  13:    End Function
  14:
  15:    Private Overloads Shared Sub ProcessEach( _
  16:      ByVal CommandLine As String)
  17:      ProcessEach(GetEnumerator(CommandLine))
  18:    End Sub
  19:
  20:    Private Overloads Shared Sub ProcessEach( _
  21:      ByVal Enumerator As IEnumerator)
  22:      While (Enumerator.MoveNext())
  23:        If (ValidCommandLine(Enumerator.Current)) Then
  24:          Sorter.Sort(Enumerator.Current)
  25:        End If
  26:      End While
  27:    End Sub
  28:
  29:    Public Shared Sub Process(ByVal CommandLine As String)
  30:      If (Help.WantsHelp(CommandLine)) Then
  31:        Help.Show()
  32:      Else
  33:        ProcessEach(CommandLine)
  34:      End If
  35:    End Sub
  36:
  37:    Public Shared Sub Main()
  38:      Process(Command())
  39:    End Sub
  40:  End Class
  41:
  42:  Public Class Sorter
  43:    Private FFileName As String
  44:    Private FData As ArrayList
  45:
  46:    Public Property FileName() As String
  47:      Get
  48:        Return FFileName
  49:      End Get
  50:      Set(ByVal Value As String)
  51:        FFileName = Value
  52:      End Set
  53:    End Property
  54:
  55:    Public Sub New(ByVal AFileName As String)
  56:      FData = New ArrayList()
  57:      FFileName = AFileName
  58:    End Sub
  59:
  60:    Public Sub Read()
  61:      Dim Reader As TextReader = File.OpenText(FFileName)
  62:
  63:      Try
  64:        While Reader.Peek <> -1
  65:          FData.Add(Reader.ReadLine)
  66:        End While
  67:
  68:      Finally
  69:        Reader.Close()
  70:      End Try
  71:
  72:    End Sub
  73:
  74:    Private Sub WriteElapsedTime(ByVal Elapsed As Double)
  75:      Debug.WriteLine(String.Format("Elapsed seconds: {0}", Elapsed))
  76:    End Sub
  77:
  78:    Public Sub TimedSort()
  79:      Dim Start As Double = Timer
  80:      Sort()
  81:      WriteElapsedTime(Timer - Start)
  82:    End Sub
  83:
  84:    Public Sub Sort()
  85:      FData.Sort(New StringComparer())
  86:    End Sub
  87:
  88:    Private Function TempFileName() As String
  89:      Return "..\" & Path.GetFileName(Path.GetTempFileName)
  90:    End Function
  91:
  92:    Public Sub Write()
  93:      Dim Writer As TextWriter = File.CreateText(TempFileName())
  94:      Dim Enumerator As IEnumerator = FData.GetEnumerator
  95:
  96:      Try
  97:        While Enumerator.MoveNext
  98:          Writer.WriteLine(Enumerator.Current)
  99:        End While
  100:
  101:      Finally
  102:        Writer.Close()
  103:      End Try
  104:
  105:    End Sub
  106:
  107:    Private Sub Run()
  108:      Read()
  109:      TimedSort()
  110:      Write()
  111:    End Sub
  112:
  113:    Public Shared Sub Sort(ByVal FileName As String)
  114:      Dim Instance As New Sorter(FileName)
  115:      Dim Thread As New Threading.Thread(AddressOf Instance.Run)
  116:      Thread.Start()
  117:    End Sub
  118:
  119:    Public Shared Function Valid(ByVal CommandLine As String) As Boolean
  120:      Return System.IO.File.Exists(CommandLine)
  121:    End Function
  122:
  123:  End Class
  124:
  125:  Public Class Help
  126:    Private Shared Function ContainsHelpSwitch( _
  127:      ByVal CommandLine As String) As Boolean
  128:
  129:      Return CommandLine.IndexOf("-?") > -1 Or _
  130:        CommandLine.ToUpper.IndexOf("-H") > -1 Or _
  131:        CommandLine.ToUpper.IndexOf("-HELP") > -1 Or CommandLine = ""
  132:
  133:    End Function
  134:
  135:    Public Shared Function WantsHelp(ByVal CommandLine As String) As Boolean
  136:      Return ContainsHelpSwitch(CommandLine)
  137:    End Function
  138:
  139:    Public Shared Function Usage() As String
  140:      Return _
  141:        "Usage: filesort filename.txt  [-h,-?-help] " & vbCrLf & _
  142:        "Sorts lines of text in file" & vbCrLf & _
  143:        "-h-?-help - Displays the help message" & vbCrLf
  144:    End Function
  145:
  146:    Public Shared Sub Show()
  147:      Console.WriteLine(Usage())
  148:    End Sub
  149:  End Class
  150:
  151:  Public Class StringComparer
  152:    Implements IComparer
  153:
  154:    Public Function Compare(ByVal x As Object, _
  155:      ByVal y As Object) As Integer _
  156:      Implements System.Collections.IComparer.Compare
  157:      Try
  158:        Return String.Compare(x.ToString.Substring(0, 3), _
  159:          y.ToString.Substring(0, 3))
  160:      Catch
  161:        Return String.Compare(x.ToString, y.ToString)
  162:      End Try
  163:    End Function
  164:  End Class


If you type this code from scratch, remember to make a public, shared subroutine named Main the startup subroutine. Visual Studio .NET should prompt you for this information.

The Read method is an instance method that assumes FFileName contains a valid filename. It calls File.OpenText with the FFileName field value to obtain a TextReader (line 61). The resource protection block, represented by the Try..Finally..End Try block on lines 63 to 70, uses the Finally clause to ensure that the TextReader, and with it the file, is closed whether or not an exception occurs; see line 69. The While..End While loop ensures that we read all lines. Reader.Peek <> -1 checks for the end of the file: Peek returns the next character in the stream without consuming it, and returns -1 when no characters remain. The ReadLine method reads a single line of text, and FData.Add appends the line to the ArrayList object named FData.
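The read loop can also be written without Peek. The following is a minimal sketch, not the author's code: it assumes a hypothetical module named ReadLinesSketch and a hypothetical helper named ReadLines, and it relies on ReadLine returning Nothing at the end of the stream. The sample file it creates exists only so that the sketch runs on its own.

```vbnet
Imports System
Imports System.Collections
Imports System.IO

Module ReadLinesSketch

    ' Reads every line of the named text file into an ArrayList.
    ' (ReadLines is a hypothetical helper, not part of Listing 13.7.)
    Public Function ReadLines(ByVal fileName As String) As ArrayList
        Dim lines As New ArrayList()
        Dim reader As TextReader = File.OpenText(fileName)
        Try
            ' ReadLine returns Nothing once no more lines are available.
            Dim line As String = reader.ReadLine()
            While Not line Is Nothing
                lines.Add(line)
                line = reader.ReadLine()
            End While
        Finally
            ' Runs on success and on failure, so the file is always closed.
            reader.Close()
        End Try
        Return lines
    End Function

    Sub Main()
        ' Hypothetical input: write a small file so the sketch runs on its own.
        Dim sampleFile As String = Path.GetTempFileName()
        Dim writer As TextWriter = File.CreateText(sampleFile)
        Try
            writer.WriteLine("300 gamma")
            writer.WriteLine("100 alpha")
            writer.WriteLine("200 beta")
        Finally
            writer.Close()
        End Try

        Dim lines As ArrayList = ReadLines(sampleFile)
        Console.WriteLine("Lines read: " & lines.Count.ToString())
        File.Delete(sampleFile)
    End Sub

End Module
```

Testing the value returned by ReadLine means the end-of-file check and the read happen in a single call; whether that reads better than the Peek test in the listing is a matter of taste.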

Lines 60 through 72 demonstrate one means of reading an entire text file. For practical solutions you might need to devise a cleverer means of caching, sorting, and managing text data. The sample solution shown worked reasonably well on a 250MB file, containing approximately 2.7 million rows of exported data on a PC with 256MB of RAM.

There are several shared members in the System.IO.File class; the online help documentation and experimentation will allow you to discover them. The shared method File.OpenText returns a StreamReader, which is a subclass of TextReader, so the result can be assigned to a TextReader variable as on line 61. To write the sorted file back to disk, we will use an instance of TextWriter.

Using TextWriter

System.IO.File.CreateText returns a TextWriter that writes text to a disk file. The TextWriter class is abstract (it has the MustInherit modifier), so CreateText actually returns an instance of a TextWriter subclass, StreamWriter.

  92:  Public Sub Write()
  93:    Dim Writer As TextWriter = File.CreateText(TempFileName())
  94:    Dim Enumerator As IEnumerator = FData.GetEnumerator
  95:
  96:    Try
  97:      While Enumerator.MoveNext
  98:        Writer.WriteLine(Enumerator.Current)
  99:      End While
  100:
  101:    Finally
  102:      Writer.Close()
  103:    End Try
  104:
  105:  End Sub

Writing the sorted file back to disk is the reverse of the read process. The excerpt from lines 92 to 105 of Listing 13.7 obtains a TextWriter by calling the shared method File.CreateText. The filename comes from TempFileName (lines 88 to 90), which calls System.IO.Path.GetTempFileName to obtain a unique name for the output file. The enumerator returned by ArrayList.GetEnumerator is used to iterate over each element of the ArrayList, and the TextWriter.WriteLine method writes each line of text to the new output file. The result is that we have the original file and the new sorted file.
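The write step can be pulled out into a small stand-alone routine. The following is a minimal sketch, not the author's code: the module name WriteLinesSketch and the helper WriteLines are hypothetical, and the sketch writes directly to the full path returned by GetTempFileName rather than to the parent directory as the listing does.

```vbnet
Imports System
Imports System.Collections
Imports System.IO

Module WriteLinesSketch

    ' Writes each element of the list to the named file, one per line.
    ' (WriteLines is a hypothetical helper, not part of Listing 13.7.)
    Public Sub WriteLines(ByVal fileName As String, ByVal lines As ArrayList)
        Dim writer As TextWriter = File.CreateText(fileName)
        Try
            Dim enumerator As IEnumerator = lines.GetEnumerator()
            While enumerator.MoveNext()
                writer.WriteLine(enumerator.Current)
            End While
        Finally
            ' Closing the writer flushes buffered text and releases the file.
            writer.Close()
        End Try
    End Sub

    Sub Main()
        Dim lines As New ArrayList()
        lines.Add("100 alpha")
        lines.Add("200 beta")

        ' GetTempFileName creates an empty file and returns its full path;
        ' CreateText then overwrites that same file.
        Dim outputFile As String = Path.GetTempFileName()
        WriteLines(outputFile, lines)

        Console.WriteLine("Wrote " & lines.Count.ToString() & " lines to " & outputFile)
        File.Delete(outputFile)
    End Sub

End Module
```

Writing to the path that GetTempFileName returns means the empty file it creates is reused instead of being left behind in the temporary directory.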

Getting a Temporary File

The System.IO.Path class contains the GetTempFileName method, which returns a unique filename. This is one of many cases in .NET where the CLR implements a capability that used to be accessed through API calls as a member of a class defined in a suitable namespace. The Path class defines several shared fields and methods, described in Table 13.2.

Table 13.2. Shared Members of the System.IO.Path Class
Member                        Description

Shared Fields
AltDirectorySeparatorChar     Platform-specific alternate directory separator
DirectorySeparatorChar        Platform-specific directory separator
InvalidPathChars              Platform-specific list of invalid path characters
PathSeparator                 Platform-specific character that separates paths in environment variables
VolumeSeparatorChar           Platform-specific volume separator

Shared Methods
ChangeExtension               Changes the file extension
Combine                       Combines two file paths
GetDirectoryName              Returns the path without the filename
GetExtension                  Returns a file extension
GetFileName                   Returns the filename, including the extension
GetFileNameWithoutExtension   Returns a filename without the extension
GetFullPath                   Expands the path argument to the full path
GetPathRoot                   Returns the root of the path
GetTempFileName               Returns a unique filename and creates an empty file on the disk
GetTempPath                   Returns the path to your system temporary directory
HasExtension                  Indicates if the path contains a file extension
IsPathRooted                  Indicates if the path contains the root

As demonstrated by the members in Table 13.2, the CLR plays the role that Windows API declarations play in VB6. If you are looking for a capability that you used to reach through an API declaration, check the namespaces and classes in the CLR first.
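Several of the Path methods from Table 13.2 combine naturally. The following is a minimal sketch under stated assumptions: the module PathSketch, the function SortedNameFor, and the ".sorted" naming scheme are all hypothetical and are not part of FileSort.sln. It builds an output filename in the same directory as the input file.

```vbnet
Imports System
Imports System.IO

Module PathSketch

    ' Builds an output name beside the input file,
    ' for example "data.txt" becomes "data.sorted.txt".
    ' (SortedNameFor and the ".sorted" suffix are hypothetical.)
    Public Function SortedNameFor(ByVal inputFile As String) As String
        Dim folder As String = Path.GetDirectoryName(inputFile)
        ' GetDirectoryName returns Nothing for a root path; treat that as no folder.
        If folder Is Nothing Then folder = ""

        Dim baseName As String = Path.GetFileNameWithoutExtension(inputFile)
        Dim extension As String = Path.GetExtension(inputFile)

        ' Combine inserts the platform's directory separator when one is needed.
        Return Path.Combine(folder, baseName & ".sorted" & extension)
    End Function

    Sub Main()
        Console.WriteLine(SortedNameFor("data.txt"))
        Console.WriteLine(SortedNameFor(Path.Combine("reports", "data.txt")))
        Console.WriteLine("Temporary directory: " & Path.GetTempPath())
    End Sub

End Module
```

Deriving the output name from the input name keeps the sorted file next to its source, which is one alternative to the temporary filename used in Listing 13.7.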

Using the IEnumerator Interface

The section "Using Array Methods" in Chapter 3 introduced the IEnumerator interface. The IEnumerator interface introduces three members, MoveNext, Current, and Reset, that offer a consistent interface for iterating the elements of a class that contains multiple objects, like the ArrayList.

Any class that implements the IEnumerator interface can be passed to methods that expect an IEnumerator parameter and can be used in a loop in the same way (see lines 22 to 26 and 97 to 99 in Listing 13.7). Additionally, it is important to remember that methods and classes are reusable, whereas statements are not: a loop written against IEnumerator and placed in a method can be reused for any collection that supplies an enumerator.
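To illustrate the point about reuse, the following minimal sketch passes enumerators from two different collection types, an ArrayList and a String array, to the same method. The module EnumeratorSketch and the function JoinItems are hypothetical names introduced only for this example.

```vbnet
Imports System
Imports System.Collections
Imports System.Text

Module EnumeratorSketch

    ' Joins the text of every item the enumerator yields.
    ' (JoinItems is a hypothetical helper, not part of Listing 13.7.)
    Public Function JoinItems(ByVal items As IEnumerator, _
        ByVal separator As String) As String

        Dim builder As New StringBuilder()
        Dim first As Boolean = True

        While items.MoveNext()
            ' Place the separator between items, not before the first one.
            If Not first Then builder.Append(separator)
            builder.Append(items.Current)
            first = False
        End While

        Return builder.ToString()
    End Function

    Sub Main()
        Dim list As New ArrayList()
        list.Add("b.txt")
        list.Add("a.txt")

        Dim files() As String = "one.txt two.txt".Split(" "c)

        ' The same method serves both collections.
        Console.WriteLine(JoinItems(list.GetEnumerator(), ", "))   ' b.txt, a.txt
        Console.WriteLine(JoinItems(files.GetEnumerator(), ", "))  ' one.txt, two.txt
    End Sub

End Module
```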

Sorting an ArrayList

The ArrayList implements an instance Sort method. You can sort the entire array, a subsection of the array, or pass an IComparer object to describe how the comparison works. The three overloaded ArrayList.Sort procedure headers follow:

 Overloads Overridable Public Sub Sort()
 Overloads Overridable Public Sub Sort(IComparer)
 Overloads Overridable Public Sub Sort(Integer, Integer, IComparer)

The Sort method implements the quick sort algorithm, which has O(n log₂ n) characteristics on average. The value n represents the number of elements, and the O notation is a mathematical description of how the running time of the algorithm grows with n. Simplistically, a quick sort takes n multiplied by the log base 2 of n units of time. In the worst case a quick sort decays to roughly the performance characteristics of a bubble sort, which is O(n²), or n-squared; with a naive choice of pivot, such as always taking the first element, this happens when the elements are already sorted.
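As a rough worked example, an illustration rather than a measurement, take the approximately 2.7 million rows mentioned earlier in this section and compare the two growth rates:

```latex
\[
n = 2.7 \times 10^{6}, \qquad \log_2 n \approx 21.4
\]
\[
n \log_2 n \approx 5.8 \times 10^{7}, \qquad n^{2} \approx 7.3 \times 10^{12}
\]
\[
\frac{n^{2}}{n \log_2 n} = \frac{n}{\log_2 n} \approx 1.3 \times 10^{5}
\]
```

In units of comparison work, the n-squared case is on the order of a hundred thousand times more expensive at this size, which is why the worst case matters for large files.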

The first implementation of Sort uses the IComparable implementation of each element in the ArrayList to perform the comparisons. The second implementation allows you to pass an object that implements IComparer, and the third allows you to selectively sort a range of elements, identified by a starting index and a count, using a custom class that implements IComparer.
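The three overloads can be compared side by side. The following is a minimal sketch: the module SortOverloadsSketch, the DescendingComparer class, and the helpers MakeList and Render are hypothetical names introduced for this example. It passes Nothing as the comparer in the ranged overload, which falls back to the IComparable implementation of the elements.

```vbnet
Imports System
Imports System.Collections
Imports System.Text

Module SortOverloadsSketch

    ' Hypothetical comparer for illustration: reverses the default ordering.
    Public Class DescendingComparer
        Implements IComparer

        Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
            Implements IComparer.Compare
            ' Swapping the arguments inverts the result of the default comparison.
            Return System.Collections.Comparer.Default.Compare(y, x)
        End Function
    End Class

    ' Builds the same unsorted list for each demonstration.
    Public Function MakeList() As ArrayList
        Dim list As New ArrayList()
        list.Add(5)
        list.Add(3)
        list.Add(9)
        list.Add(1)
        list.Add(7)
        Return list
    End Function

    ' Renders the elements separated by single spaces.
    Public Function Render(ByVal list As ArrayList) As String
        Dim builder As New StringBuilder()
        Dim i As Integer
        For i = 0 To list.Count - 1
            If i > 0 Then builder.Append(" ")
            builder.Append(list(i))
        Next
        Return builder.ToString()
    End Function

    Sub Main()
        Dim a As ArrayList = MakeList()
        a.Sort()                            ' uses each element's IComparable
        Console.WriteLine(Render(a))        ' 1 3 5 7 9

        Dim b As ArrayList = MakeList()
        b.Sort(New DescendingComparer())    ' uses the supplied IComparer
        Console.WriteLine(Render(b))        ' 9 7 5 3 1

        Dim c As ArrayList = MakeList()
        c.Sort(0, 3, Nothing)               ' first three elements, default ordering
        Console.WriteLine(Render(c))        ' 3 5 9 1 7
    End Sub

End Module
```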

Implementing the IComparer Interface

Listing 13.7 uses a custom class, StringComparer, that implements the IComparer interface to sort the lines of text based on a field offset. The only method in the IComparer interface is the Compare function, which returns a negative number if argument x is less than argument y, 0 if the arguments are equal, and a positive number if x is greater than y. How you decide which argument is less is up to you. In Listing 13.7, lines 158 and 159 demonstrate how to compare a substring: the first three characters of each line are compared as strings. If either x or y represents a string less than three characters long, Substring throws an exception, and the exception block catches it and performs a straight string comparison instead, which prevents failure.

Keep in mind that x and y can represent any kind of data. Listing 13.7 is designed with the foreknowledge that the class will be comparing strings. Line 85 of Listing 13.7 demonstrates how to construct an instance of the StringComparer class and pass it to the ArrayList.Sort method.
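A comparer of this kind can also test the string length up front instead of relying on an exception. The following is a minimal sketch, not the author's implementation: the module PrefixComparerSketch and the class PrefixComparer are hypothetical, the key length is passed to the constructor rather than fixed at three, and the sketch uses String.CompareOrdinal, which is an assumption made here so that the result does not depend on the current culture.

```vbnet
Imports System
Imports System.Collections

Module PrefixComparerSketch

    ' Hypothetical comparer: orders lines by a fixed-width key
    ' taken from the start of each line.
    Public Class PrefixComparer
        Implements IComparer

        Private FKeyLength As Integer

        Public Sub New(ByVal AKeyLength As Integer)
            FKeyLength = AKeyLength
        End Sub

        ' Returns the leading key, or the whole line if it is shorter than the key.
        Private Function KeyOf(ByVal value As Object) As String
            Dim line As String = value.ToString()
            If line.Length <= FKeyLength Then Return line
            Return line.Substring(0, FKeyLength)
        End Function

        Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
            Implements IComparer.Compare
            ' Negative if x sorts first, zero if the keys match, positive otherwise.
            Return String.CompareOrdinal(KeyOf(x), KeyOf(y))
        End Function
    End Class

    Sub Main()
        Dim lines As New ArrayList()
        lines.Add("300 gamma")
        lines.Add("100 alpha")
        lines.Add("20")
        lines.Add("200 beta")

        lines.Sort(New PrefixComparer(3))

        Dim line As Object
        For Each line In lines
            Console.WriteLine(line)
        Next
        ' Prints: 100 alpha, 20, 200 beta, 300 gamma (one per line)
    End Sub

End Module
```

Checking the length avoids the cost of throwing and catching an exception for every short line, which can matter when a file contains many of them.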


Year: 2001