8.2. Using Sequential FilesIn addition to being accessed for information, sequential files are regularly updated by modifying certain pieces of data, removing some records, and adding new records. These tasks can be performed most efficiently if the files are first sorted. Sorting Sequential FilesThe records of a sequential file can be sorted on any field by first reading the data into an array of structures and then sorting on an individual element. Example 1. |
The following program sorts the sequential file YOB.TXT of the previous section by year of birth: Structure Individual Dim name As String Dim yearBorn As Integer End Structure Private Sub btnSort_Click(...) Handles btnSort.Click 'Sort data from YOB.TXT file by year of birth Dim numPeople As Integer numPeople = NumberOfRecords("YOB.TXT") Dim person(numPeople - 1) As Individual ReadData(person, numPeople) SortData(person, numPeople) ShowData(person, numPeople) WriteData(person, numPeople) End Sub Function NumberOfRecords(ByVal filespec As String) As Integer Dim name As String, yearBorn As Integer Dim n As Integer 'Used to count records n = 0 Dim sr As IO.StreamReader = IO.File.OpenText(filespec) Do While (sr.Peek <> -1) name = sr.ReadLine yearBorn = CInt(sr.ReadLine) n += 1 'Increase n by 1 Loop sr.Close() Return n End Function Sub ReadData(ByRef person() As Individual, ByVal numPeople As Integer) 'Read data from file into arrays Dim sr As IO.StreamReader = IO.File.OpenText("YOB.TXT") For index As Integer = 0 To numPeople - 1 person(index).name = sr.ReadLine person(index).yearBorn = CInt(sr.ReadLine) Next sr.Close() End Sub Sub SortData(ByRef person() As Individual, ByVal numPeople As Integer) 'Bubble sort arrays by year of birth For passNum As Integer = 1 To numPeople - 1 For index As Integer = 1 To numPeople - passNum If person(index - 1).yearBorn > person(index).yearBorn Then SwapData(person, index - 1) [Run, and then click the button.]
|
One common way to store data in a text file is to place each record on a single line with the fields separated by commas. This design is known as CSV format. (Note: CSV is an abbreviation for Comma Separated Values.) For example, the YOB.TXT file of the previous section would appear as follows in CSV format:
Barbra,1942 Ringo,1940 Sylvester,1946
Let's refer to the text file design we have been using as LSV format, where LSV stands for Line Separated Values. Also, let's make the convention of placing the prefix csv in front of the name of every file using CSV format. For instance, the preceding file would be named "csvYOB.TXT".
Visual Basic has a function called Split that facilitates working with CSV formatted files. Split can convert a line containing commas into a String array where the 0th element contains the text preceding the first comma, the 1st element contains the text between the first and second commas, ..., and the last element contains the text following the last comma. For instance, suppose the String array employees() has been declared without an upper bound, and the String variable line has the value "Bob, 23.50, 45". Then the statement
employees = line.Split(","c)
sets the size of employees() to 3 and sets employees(0) = "Bob", employees(1) = "23.50", and employees(2) = "45". In the previous line of code, the character comma is called the delimiter for the Split function, and the letter c specifies that the comma have data type Character instead of String. (If Option Strict is Off, the letter c can be omitted.) Any character can be used as a delimiter. If no character is specified, the Split function will use the space character as delimiter.
The following program illustrates the use of the Split function: Private Sub btnRead_Click(...) Handles btnRead.Click Dim stateData(), line As String line = "California, 1850, Sacramento, Eureka" stateData = line.Split(","c) For i As Integer = 0 To stateData.GetUpperBound(0) stateData(i) = stateData(i).Trim 'Get rid of extraneous spaces lstOutput.Items.Add(stateData(i)) Next End Sub [Run, and then click the button. The following is displayed in the list box.] California 1850 Sacramento Eureka |
Sequential files created with Visual Basic 6.0 are usually in CSV format. However, they can be converted easily to LSV format if desired.
The following program converts any CSV file to an LSV file and then displays the contents of the new file. We will assume that the file to be converted is located in the Debug subfolder for the program. Private Sub btnConvert_Click(...) Handles btnConvert.Click Dim line, fields(), fromFile, toFile As String Dim sr As IO.StreamReader Dim sw As IO.StreamWriter fromFile = InputBox("Name of original file:", "Convert CSV to LSV") toFile = InputBox("Name of converted file:", "Convert CSV to LSV") sr = IO.File.OpenText(fromFile) sw = IO.File.CreateText(toFile) Do While (sr.Peek <> -1) line = sr.ReadLine fields = line.Split(","c) For i As Integer = 0 To fields.GetUpperBound(0) sw.WriteLine(fields(i).Trim) Next Loop sr.Close() sw.Close() sr = IO.File.OpenText(toFile) Do While (sr.Peek <> -1) lstFile.Items.Add(sr.ReadLine) Loop sr.Close() End Sub [Run, press the button, and respond to the two requests with csvSTATES.TXT and STATES.TXT. Assume that the file csvSTATES.TXT contains the two lines "California, 1850, Sacramento, Eureka" and "New York, 1788, Albany, Excelsior". The following will be displayed in the list box.] California 1850 Sacramento Eureka New York 1788 Albany Excelsior |
The reverse of the Split function is the Join function, which concatenates the elements of a string array into a string containing the elements separated by a specified delimiter. For instance, the code
Dim greatLakes() As String = _ {"Huron", "Ontario", "Michigan", "Erie", "Superior"} Dim lakes As String lakes = Join(greatLakes, ",") txtOutput.Text = lakes
produces the output
Huron,Ontario,Michigan,Erie,Superior
In Section 7.2, we considered an algorithm for merging two ordered arrays. This same algorithm can be applied to merging two ordered files.
Suppose you have two ordered LSV files (possibly with certain items appearing in both files), and you want to merge them into a third ordered file (without duplications). The technique for creating the third file is as follows:
Open the two ordered files for input, and open a third file for output.
Try to get an item of data from each file.
Repeat the following steps until an item of data is not available in one of the files:
(a). | If one item precedes the other, write it into the third file and try to get another item of data from its file. |
(b). | If the two items are identical, write one into the third file and try to get another item of data from each of the two ordered files. |
At this point, an item of data has most likely been retrieved from one of the files and not yet written to the third file. In this case, write that item and all remaining items in that file to the third file.
Close the three files.
The following program merges two ordered files of numbers into a third ordered file:
Private Sub btnProceed_Click(...) Handles btnProceed.Click 'Merge two ordered files Dim file1, file2, file3 As String Dim have1data, have2data As Boolean Dim num1, num2 As Double Dim recCount As Integer 'Number of records in merged file |
Suppose a small real estate company stores its sales data for a year in a sequential file in which each record contains four fields: month of sale, day of sale (1 through 31), address, and price. Typical data for the sales of the first quarter of a year are shown in Figure 8.4. The records are ordered by date of sale.
Month | Day | Address | Price |
January | 9 | 102 Elm Street | $203,000 |
January | 20 | 1 Main Street | $315,200 |
January | 25 | 5 Maple Street | $123,450 |
February | 15 | 1 Center Street | $100,000 |
February | 23 | 2 Vista Drive | $145,320 |
March | 15 | 205 Rodeo Circle | $389,100 |
Figure 8.5 shows the output of a program that displays the total sales for the quarter year, with a subtotal for each month.
A program to produce the output of Figure 8.5 must calculate a subtotal at the end of each month. The variable holding the month triggers a subtotal whenever its value changes. Such a variable is called a control variable, and each change of its value is called a break.
The following program produces the output of Figure 8.5. The data of Figure 8.4 are stored in the sequential file HOMESALE.TXT. The program allows for months with no sales. Because monthly subtotals will be displayed, the variable holding the current month is an appropriate control variable. Private Sub btnCreateReport_Click(...) Handles btnCreateReport.Click 'Display home sales by month Dim currentMonth As String = "", newMonth As String = "" Dim address As String = "", subtotalText As String = "" Dim dayNum As Integer, doneFlag As Boolean Dim price, monthTotal, yearTotal As Double Dim fmtStr As String = "{0,8} {1,-5}{2,-17}{3,10:C0}" Dim sr As IO.StreamReader = IO.File.OpenText("HOMESALE.TXT") doneFlag = False 'Flag to indicate end of list Do While (Not doneFlag) If (sr.Peek <> -1) Then newMonth = sr.ReadLine dayNum = CInt(sr.ReadLine) address = sr.ReadLine price = CDbl(sr.ReadLine) Else doneFlag = True 'End of list End If If (newMonth <> currentMonth) Or doneFlag Then 'Ctrl break processing If currentMonth <> "" Then 'Don't print subtotal before 1st month lstReport.Items.Add("") subtotalText = "Subtotal for " & currentMonth & ": " lstReport.Items.Add(" " & subtotalText & _ FormatCurrency(monthTotal, 0)) lstReport.Items.Add("") End If currentMonth = newMonth monthTotal = 0 End If |
In the examples of this and the previous section, the files to be processed have been opened and closed within a single procedure. However, the solution to some programming problems requires that a file be opened just once the instant the program is run and stay open until the program is terminated. This is easily accomplished by opening the file in the form's Load event procedure and coding the Close method and End statement in the click event procedure for a button labeled "Quit."
1. | The program in Example 4 contains three Do loops. Explain why at most one of the last two loops will be executed. Under what circumstances will neither of the last two loops be executed? |
2. | Modify the program in Example 4 so that duplicate items will be repeated in the merged file. |
In Exercises 1 through 4, determine the output produced by the lines of code.
1. | Dim allVowelWords(), line As String line = "abstemious,dialogue,facetious,sequoia,education" allVowelWords = line.Split(","c) txtOutput.Text = allVowelWords(2) |
2. | Dim orderWords() As String 'Letters in alphabetical order Dim line As String = "bijoux, biopsy, almost" orderWords = line.Split(","c) txtOutput.Text = orderWords(orderWords.GetUpperBound(0)).Trim |
3. | Dim notes() As String = {"A","B","C","D","E","F","G"} txtOutput.Text = Join(notes, ",") |
4. | Dim line As String = "I came, I saw, I conquered" Dim temp() As String temp = line.Split(","c) txtOutput.Text = Join(temp, ",") |
Exercises 5 through 8 are related. They create and maintain the sequential file AVERAGE.TXT to hold batting averages of baseball players.
5. | Suppose the season is about to begin. Compose a program to create the sequential file containing the name of each player, his times at bat, and his number of hits. The program should allow the user to type a name into a text box and then click a button to add a record to the file. The times at bat and number of hits initially should be set to 0. (Hint: Open the file for Output in the form's Load event procedure and Close the file when a "Quit" button is clicked.) |
6. | Each day, the statistics from the previous day's games should be used to update the file. Write a program to read the records one at a time and allow the user to enter the number of times at bat and the number of hits in yesterday's game for each player in appropriate text boxes on a form. When a button is clicked, the program should update the file by adding these numbers to the previous figures. Hint: Open files in the form's Load event procedure. Close the files and end the program when all data have been processed. |
7. | Several players are added to the league. Compose a program to update the file. |
8. | Compose a program to sort the file AVERAGE.TXT with respect to batting averages and display the players with the top 10 batting averages. Hint: The file should be read once to determine the number of players and again to load the players into an array. |
9. | Write a program to convert any LSV file into a CSV file. Note: In addition to supplying the filenames, the user also must provide the number of fields in each record of the LSV file. |
10. | The file csvAgeAtInaugural.TXT contains the name and age at inaugural for each of the first 43 U.S. Presidents. (The first record is "George Washington, 57".) Write a program that requests an age as input and then displays in a list box the names of all U.S. Presidents of that age at their inaugurals. |
11. | Each record of the file csvUSSTATES.TXT has five fields: state name, abbreviation, when it entered the union, area (in square miles), population in 2000. The file is ordered by when the states entered the union. (The first record is "Delaware, DE, Dec. 1787, 2489, 759000".) Write a program to create a new file containing the same records, but ordered alphabetically by the state's abbreviation. The program also should display the contents of the new file, neatly formatted into columns, in a list box. |
12. | A sentence is called a chain-link sentence if the last two letters of each word are the same as the first two letters of the next word, for instance, "The head administrator organized education on online networks." Write a program that accepts a sentence as input and determines whether it is a chain-link sentence. Hint: Use the space character as a delimiter for the Split method. To be robust, you can first remove commas by executing the statement sentence = sentence.Replace(",",""), which replaces all commas in sentence with the empty string. Test the program with the sentence "Broadcast station, once certified, educates estimable legions." |
Exercises 13 and 14 refer to the ordered file BLOCK.TXT containing the names of people on your block and the ordered file TIMES.TXT containing the names of all people who subscribe to the New York Times.
13. | Write a program that creates a file consisting of the names of all people on your block who subscribe to the New York Times. | |||||||||
14. | Write a program that creates a file consisting of the names of all New York Times subscribers who do not live on your block. | |||||||||
15. | Suppose that a file of positive integers is in ascending order. Write a program to determine the maximum number of times any integer is repeated in the file. (For instance, if the entries in the file are 5, 5, 6, 6, 6, and 10, then the output is 3.) | |||||||||
16. | Suppose that each record of the file SALES.TXT contains a salesperson's name and the dollar amount of a sale, and the records are ordered by the names. Write a program to display the name, number of sales, and average sale amount for each salesperson. For instance, if the first eight lines of the file contain the data: Adams, 123.45, Adams, 432.15, Brown, 89.95, Cook, 500.00, then the first two entries of the output would be
| |||||||||
17. | An elementary school holds a raffle to raise funds. Suppose that each record of the file RAFFLE.TXT contains a student's grade (1 through 6), name, and the number of raffle tickets sold and that the records are ordered by grade. Write a program using a control break to display the number of tickets sold by each grade and the total number of tickets sold. | |||||||||
18. | Multiple Control Breaks. Suppose the sorted sequential file CENSUS.TXT contains names of all residents of a state, where each record consists of two lines of the form "last Name", "first Name". Write a program to determine, in one pass through the file, the most common last name and most common full name. (Note: In the unlikely event of a tie, the program should display the first occurring name.) For instance, the output in the list box might be as follows: The most common last name is Brown. The most common full name is John Smith. |
In Exercises 19 and 20, suppose that the file MASTER.TXT contains the names and phone numbers of all members of an organization, where the records are ordered by name.
19. | Suppose that the ordered file MOVED.TXT contains the names and new phone numbers of all members who have changed their phone numbers. Write a program to update the file MASTER.TXT. |
20. | Suppose that the ordered file QUIT.TXT contains the names of all members who have left the organization. Write a program to update the file MASTER.TXT. |
1. | Execution proceeds beyond the first Do loop only when the Peek method returns for one of the input files. Because each of the last two Do loops executes only if the Peek method does not return at most one loop can execute. Neither of the last two loops will be executed if each input file is empty or if the last entries of the files are the same. |
2. | Change the Select Case block to the following: Select Case num1 Case Is <= num2 sw.WriteLine(num1) have1data = Get1data(num1, sr1) Case Is > num2 sw.WriteLine(num2) have2data = Get2data(num2, sr2) End Select |