Defining the FileSort Demo

Console applications these days are typically simple utility applications. Dynamic link libraries and executables with graphical interfaces are probably better candidates for more complicated applications, although the language imposes no limitation on console applications other than the absence of graphical interface controls. With the addition of multithreading, the complexity and variety of console applications is likely to increase rather than decrease. However, in keeping with the utility theme, we will create a multithreaded sort utility application.

The sample application adds a twist you might find helpful. There is a lot of legacy data still in existence. Big corporations with demanding data needs work with data that is moved back and forth between mainframes with older data formats and newer client/server systems. Whether this is the best way to manage the data stored in legacy systems is questionable, but it is a practical problem. The solution assumes that it may be beneficial to provide a flexible means of defining how the data is sorted.

The FileSort.sln sample application allows you to pass multiple files at the command line and sorts each file on its own thread. In addition to sorting the files on individual threads, so that shorter files do not wait on longer files, the sample application demonstrates how to implement a comparator apart from the core solution.

Using TextReader

FileSort.sln validates multiple files passed to the application on the command line. For each valid file, FileSort reads the file contents into an ArrayList and uses the Sort method of the ArrayList to sort the data. It is reasonable to assume that you might want to perform different kinds of comparisons on the data and order the data in ascending or descending order based on predetermined criteria.
It is the comparison process that guides the ordering of data, so we will use the IComparer interface to manage the sort order rather than implementing alternate sorts. To get data into the ArrayList, we will assume that the data is represented by lines of text, and we will use the methods of the ArrayList and TextReader classes to read and cache data from text files. The complete solution to FileSort.sln is shown in Listing 13.7, and excerpts covering the various aspects of the solution are described in the subsections that follow.

Listing 13.7 The complete listing to the multithreaded FileSort.sln sample program, demonstrating sorting lines of text and implementing the IComparer interface

1: Imports System.IO
2:
3: Public Class Main
4:
5:   Private Overloads Shared Function ValidCommandLine( _
6:     ByVal CommandLine As String) As Boolean
7:     Return Sorter.Valid(CommandLine)
8:   End Function
9:
10:   Private Shared Function GetEnumerator( _
11:     ByVal CommandLine As String) As IEnumerator
12:     Return CommandLine.Split(", ".ToCharArray()).GetEnumerator
13:   End Function
14:
15:   Private Overloads Shared Sub ProcessEach( _
16:     ByVal CommandLine As String)
17:     ProcessEach(GetEnumerator(CommandLine))
18:   End Sub
19:
20:   Private Overloads Shared Sub ProcessEach( _
21:     ByVal Enumerator As IEnumerator)
22:     While (Enumerator.MoveNext())
23:       If (ValidCommandLine(Enumerator.Current)) Then
24:         Sorter.Sort(Enumerator.Current)
25:       End If
26:     End While
27:   End Sub
28:
29:   Public Shared Sub Process(ByVal CommandLine As String)
30:     If (Help.WantsHelp(CommandLine)) Then
31:       Help.Show()
32:     Else
33:       ProcessEach(CommandLine)
34:     End If
35:   End Sub
36:
37:   Public Shared Sub Main()
38:     Process(Command())
39:   End Sub
40: End Class
41:
42: Public Class Sorter
43:   Private FFileName As String
44:   Private FData As ArrayList
45:
46:   Public Property FileName() As String
47:     Get
48:       Return FFileName
49:     End Get
50:     Set(ByVal Value As String)
51:       FFileName = Value
52:     End Set
53:   End Property
54:
55:   Public Sub New(ByVal AFileName As String)
56:     FData = New ArrayList()
57:     FFileName = AFileName
58:   End Sub
59:
60:   Public Sub Read()
61:     Dim Reader As TextReader = File.OpenText(FFileName)
62:
63:     Try
64:       While Reader.Peek <> -1
65:         FData.Add(Reader.ReadLine)
66:       End While
67:
68:     Finally
69:       Reader.Close()
70:     End Try
71:
72:   End Sub
73:
74:   Private Sub WriteElapsedTime(ByVal Elapsed As Double)
75:     Debug.WriteLine(String.Format("Elapsed milliseconds: {0} ", Elapsed))
76:   End Sub
77:
78:   Public Sub TimedSort()
79:     Dim Start As Double = Timer
80:     Sort()
81:     WriteElapsedTime(Timer - Start)
82:   End Sub
83:
84:   Public Sub Sort()
85:     FData.Sort(New StringComparer())
86:   End Sub
87:
88:   Private Function TempFileName() As String
89:     Return "..\" & Path.GetFileName(Path.GetTempFileName)
90:   End Function
91:
92:   Public Sub Write()
93:     Dim Writer As TextWriter = File.CreateText(TempFileName())
94:     Dim Enumerator As IEnumerator = FData.GetEnumerator
95:
96:     Try
97:       While Enumerator.MoveNext
98:         Writer.WriteLine(Enumerator.Current)
99:       End While
100:
101:     Finally
102:       Writer.Close()
103:     End Try
104:
105:   End Sub
106:
107:   Private Sub Run()
108:     Read()
109:     TimedSort()
110:     Write()
111:   End Sub
112:
113:   Public Shared Sub Sort(ByVal FileName As String)
114:     Dim Instance As New Sorter(FileName)
115:     Dim Thread As New Threading.Thread(AddressOf Instance.Run)
116:     Thread.Start()
117:   End Sub
118:
119:   Public Shared Function Valid(ByVal CommandLine As String) As Boolean
120:     Return System.IO.File.Exists(CommandLine)
121:   End Function
122:
123: End Class
124:
125: Public Class Help
126:   Private Shared Function ContainsHelpSwitch( _
127:     ByVal CommandLine As String) As Boolean
128:
129:     Return CommandLine.IndexOf("-?") > -1 Or _
130:       CommandLine.ToUpper.IndexOf("-H") > -1 Or _
131:       CommandLine.ToUpper.IndexOf("-HELP") > -1 Or CommandLine = ""
132:
133:   End Function
134:
135:   Public Shared Function WantsHelp(ByVal CommandLine As String) As Boolean
136:     Return ContainsHelpSwitch(CommandLine)
137:   End Function
138:
139:   Public Shared Function Usage() As String
140:     Return _
141:       "Usage: filesort filename.txt [-h,-?-help] " & vbCrLf & _
142:       "Sorts lines of text in file" & vbCrLf & _
143:       "-h-?-help - Displays the help message" & vbCrLf
144:   End Function
145:
146:   Public Shared Sub Show()
147:     Console.WriteLine(Usage())
148:   End Sub
149: End Class
150:
151: Public Class StringComparer
152:   Implements IComparer
153:
154:   Public Function Compare(ByVal x As Object, _
155:     ByVal y As Object) As Integer _
156:     Implements System.Collections.IComparer.Compare
157:     Try
158:       Return String.Compare(x.ToString.Substring(0, 3), _
159:         y.ToString.Substring(0, 3))
160:     Catch
161:       Return String.Compare(x.ToString, y.ToString)
162:     End Try
163:   End Function
164: End Class

Tip: If you type this code from scratch, remember to make a public, shared subroutine named Main the startup subroutine. Visual Studio .NET should prompt you for this information.

The Read method is an instance method that assumes that FFileName contains a valid filename. It creates an instance of a TextReader, initializing the TextReader with the FFileName field value. The resource protection block, represented by the Try..Finally..End Try block on lines 63 to 70, ensures that the reader is closed. In the listing, we are using the Finally clause to ensure that the TextReader, and with it the file, is closed; see line 69.

The While..End While loop ensures that we read all lines. Reader.Peek <> -1 checks for the end of the file. Peek returns the next character in the stream without consuming it; when Peek returns -1, we have reached the end of the file. The ReadLine method reads a single line of text, and FData.Add inserts the line into the ArrayList object represented by the name FData.

Lines 60 through 72 demonstrate one means of reading an entire text file. For practical solutions you might need to devise a cleverer means of caching, sorting, and managing text data.
The sample solution shown worked reasonably well on a 250MB file containing approximately 2.7 million rows of exported data, on a PC with 256MB of RAM.

There are several shared members in the System.IO.File class. The online help documentation and experimentation will allow you to discover several of these methods. The shared method File.OpenText returns a TextReader instance. To write the sorted file back to disk, we will use an instance of TextWriter.

Using TextWriter

System.IO.File.CreateText returns an instance of the TextWriter class. TextWriter objects are used to write text to disk files. The TextWriter class is an abstract class (it has the MustInherit modifier), and CreateText returns an instance of a subclass of TextWriter, StreamWriter.

92:   Public Sub Write()
93:     Dim Writer As TextWriter = File.CreateText(TempFileName())
94:     Dim Enumerator As IEnumerator = FData.GetEnumerator
95:
96:     Try
97:       While Enumerator.MoveNext
98:         Writer.WriteLine(Enumerator.Current)
99:       End While
100:
101:     Finally
102:       Writer.Close()
103:     End Try
104:
105:   End Sub

Writing the sorted file back to disk is the reverse of the read process. The excerpt from Listing 13.7 on lines 92 to 105 retrieves an instance of a TextWriter by calling the shared method File.CreateText, and it calls System.IO.Path.GetTempFileName to ensure that we create a unique output file. The enumerator returned by ArrayList.GetEnumerator is used to iterate over each element of the ArrayList, and the TextWriter.WriteLine method is used to write each line of text to the new output file. The result is that we have the original file and the new sorted file.

Getting a Temporary File

The System.IO.Path class contains the GetTempFileName method, which returns a unique filename. As with many capabilities in .NET, functionality that used to be accessed through API calls is implemented as members of classes defined in suitable namespaces. The Path class defines several shared methods, described in Table 13.2.

Table 13.2. Shared Members of the System.IO.Path Class
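A few of the Path class's shared members can be tried out in a small console program. The following is a minimal sketch rather than part of FileSort.sln; the module name PathDemo is hypothetical, and the program only prints the pieces of a temporary file's path.

```vb
Imports System
Imports System.IO

' Hypothetical demo module; not part of Listing 13.7.
Module PathDemo
    Sub Main()
        ' GetTempFileName creates a uniquely named, zero-byte file
        ' in the temporary folder and returns its full path.
        Dim Temp As String = Path.GetTempFileName()

        Console.WriteLine(Path.GetFileName(Temp))       ' name and extension only
        Console.WriteLine(Path.GetDirectoryName(Temp))  ' containing folder
        Console.WriteLine(Path.GetExtension(Temp))      ' extension, including the dot

        ' Combine joins path parts with the correct separator, an
        ' alternative to concatenating "..\" by hand.
        Console.WriteLine(Path.Combine("..", Path.GetFileName(Temp)))

        ' Remove the placeholder file that GetTempFileName created.
        File.Delete(Temp)
    End Sub
End Module
```

Because GetTempFileName creates the file as a side effect, code that keeps only the file name portion, as TempFileName does in Listing 13.7, leaves an empty placeholder behind in the temporary folder; deleting it, as the sketch does, is one way to tidy up.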
As demonstrated by the members in Table 13.2, the CLR plays the role that Windows API declarations play in VB6. If you are looking for a capability that you used to reach through an API call, check the namespaces and classes in the CLR first.

Using the IEnumerator Interface

The section "Using Array Methods" in Chapter 3 introduced the IEnumerator interface. The IEnumerator interface introduces three members, MoveNext, Current, and Reset, that offer a consistent interface for iterating over the elements of a class that contains multiple objects, like the ArrayList. Any class that implements the IEnumerator interface can be passed as a value to a method that has an IEnumerator parameter and can be used in a loop like those on lines 22 to 26 and 97 to 99 in Listing 13.7. Additionally, it is important to remember that methods and classes are reusable, whereas statements are not.

Sorting an ArrayList

The ArrayList implements an instance Sort method. You can sort the entire array, sort a subsection of the array, or pass an IComparer object to describe how the comparison works. The three overloaded ArrayList.Sort procedure headers follow:

Overloads Overridable Public Sub Sort()
Overloads Overridable Public Sub Sort(IComparer)
Overloads Overridable Public Sub Sort(Integer, Integer, IComparer)

The Sort method implements the quick sort algorithm, which has O(n log₂ n) characteristics. The value n represents the number of elements, and the order of magnitude is a mathematical description of the performance of the algorithm. Simplistically, a quick sort takes n multiplied by the log base 2 of n units of time. In the worst case, a quick sort decays to roughly the performance characteristics of a bubble sort, which is O(n²), or n-squared; with a simple pivot choice this can happen when the elements are already sorted.
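To put the two orders of magnitude side by side, the arithmetic can be worked for the roughly 2.7 million rows mentioned earlier. This is a rough sketch only: the module name SortCostEstimate is hypothetical, and the figures are unit counts from the formulas, not measured timings.

```vb
Imports System

' Hypothetical demo module; estimates only, not measurements.
Module SortCostEstimate
    Sub Main()
        ' Row count taken from the 2.7-million-row file discussed in the text.
        Dim N As Double = 2700000

        ' Typical quick sort cost: n multiplied by the log base 2 of n.
        Dim Typical As Double = N * Math.Log(N, 2)

        ' Worst-case cost: n multiplied by n.
        Dim Worst As Double = N * N

        Console.WriteLine("n log2 n:  {0:N0}", Typical)
        Console.WriteLine("n squared: {0:N0}", Worst)
    End Sub
End Module
```

The log base 2 of 2.7 million is a little over 21, so the typical case works out to roughly 58 million units, while the n-squared case is roughly 7.3 trillion. That gap is why the worst case matters for large files.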
The first implementation of Sort uses the IComparable implementation of each element in the ArrayList. The second implementation allows you to pass an object that implements IComparer, and the third allows you to sort a range of elements, given a starting index and a count, using a custom class that implements IComparer.

Implementing the IComparer Interface

Listing 13.7 uses a custom class, StringComparer, that implements the IComparer interface to sort the lines of text based on field offsets. The only method in the IComparer interface is a Compare function that returns a negative number if argument x is less than argument y, 0 if the arguments are equal, and a positive number if x is greater than y. How you arrive at that result is up to you.

In Listing 13.7, an exception block is used to prevent failure. The first three characters of each argument are compared as strings. If either x or y is a string fewer than three characters long, Substring throws an exception and the Catch block performs a straight string comparison instead. Lines 158 and 159 demonstrate how to compare a substring. Keep in mind that x and y can represent any kind of data; Listing 13.7 is designed with the foreknowledge that the class will be comparing strings. Line 85 of Listing 13.7 demonstrates how to construct an instance of the StringComparer class and pass it to the ArrayList.Sort method.
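As one illustration of swapping the comparison without touching the sort itself, the sketch below orders the same lines in descending order. DescendingComparer and ComparerDemo are hypothetical names introduced for this example and are not part of FileSort.sln; the sample data is made up.

```vb
Imports System
Imports System.Collections

' Hypothetical comparer: reverses ordinary string order.
Public Class DescendingComparer
    Implements IComparer

    Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
        Implements IComparer.Compare
        ' Passing y before x inverts the sign of the result,
        ' which turns an ascending sort into a descending one.
        Return String.Compare(y.ToString(), x.ToString())
    End Function
End Class

' Hypothetical demo module with made-up sample lines.
Module ComparerDemo
    Sub Main()
        Dim Lines As New ArrayList()
        Lines.Add("103 Carol")
        Lines.Add("101 Alice")
        Lines.Add("102 Bob")

        ' No comparer: each String's own IComparable ordering is used.
        Lines.Sort()

        ' Custom comparer: the same Sort call, a different ordering.
        Lines.Sort(New DescendingComparer())

        Dim Line As Object
        For Each Line In Lines
            Console.WriteLine(Line)
        Next
    End Sub
End Module
```

Keeping the ordering rule in its own class means a different field offset or direction needs only a new comparer; Sorter.Sort in Listing 13.7 would change by a single constructor call.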