Section 10.1. Text Files


10.1. Text Files

Text files are the simplest kind of data to import. These files are common and can come from any kind of system. In some cases older mainframe systems have produced a file for years and an Excel application uses the file as input. When the time comes to rewrite the mainframe application it is not difficult for a modern server to create the same file. It is common to find an old data-sharing relationship based on text files even with today's technology. The advantages of this data-sharing scheme are simplicity, size, and stability. Excel understands two kinds of text files.

10.1.1. Fixed Length Files

In a fixed length file, every record or row has the same number of characters. Each data element, such as a name or phone number, has a fixed number of characters. If a name field is set up with a length of 25 characters and a name comes in that is too long, the extra characters are lost. If the name is less than 25 characters long, blanks are inserted to make up the difference. This kind of data storage is typical of older systems and dates from the era of punch cards and accounting machines. You are not likely to see a new process being built with this kind of file, but there are plenty of older systems that still use them.

In this chapter our sample data is stock prices. The data includes stock symbol, date, open, high, low, close, and volume. A fixed length version of this data is in a file named ch10_FixedLen.txt and is located on my C drive in the My Documents folder.

To import the file, I select Data Import External Data Import Data, then navigate to and select the file. If the filename does not show up in the dialog, I need to be sure the .txt file type is selected as shown in Figure 10-1.

Figure 10-1. Selecting the text file type


Once the file is selected, the text import wizard starts and the dialog in Figure 10-2 is displayed.

This is a fixed length file and the Fixed Width option is selected. If I wanted to skip rows at the top of the file containing heading information, I could indicate the row number to start the import at. The display area at the bottom shows the contents of the file. It is easy to see that the file is fixed length. Clicking Next brings up the dialog in Figure 10-3.

In this step the data is mapped by position. Excel tries to map the file for you and in most cases it will be correct. The lines and arrows can be moved around and changed manually, allowing you to map the file anyway you like. Once the mappings are correct, clicking the Next button brings up Step 3 as in Figure 10-4.

Figure 10-2. Text import - Step 1


Figure 10-3. Text import - Step 2


Here you can select a data type (general, text, or date) for each data item if necessary. In most cases Excel will get it right and you won't have to do anything. You can also tell Excel not to import some of the data items. Select the data items one by one by clicking on the data or heading in the Data preview area. The Advanced button displays the dialog shown in Figure 10-5.

Figure 10-4. Text import - Step 3


Figure 10-5. Controlling the display of numeric data


This dialog gives you control over how numeric information is handled. It defaults to the normal convention for numbers, but allows you to specify a non-traditional numeric format if needed. When all the settings are right, click on Finish (shown in Figure 10-4), and the dialog in Figure 10-6 is displayed.

Here we set the cell to receive the import. In this case we are putting the data on the active sheet, but there is an option to create a new sheet to hold the data.

Figure 10-6. Finishing the text import


Excel's data import tool is flexible and makes working with most text files easy. But, its real power is in the Properties option on this dialog. Clicking the Properties button reveals Figure 10-7.

Figure 10-7. The Properties dialog


The options allow you to control how data is put on the sheet and what happens to the older data that is already there. If you are importing data for analysis or a one-time job, these options may not help much. But if you are building an application that needs to be updated periodically, these features can make the job easier. The application in Chapter 8 monitors a queue. Its data is a snapshot taken at a point in time. As work progresses through the day it needs to refresh its data to stay current. Building the logic to do this is complex, and in Chapter 8 we didn't consider that problem.

Suppose the data for the application is exported by another system as a text file every 10 minutes. We would want the application to import new data every time it is started and every 10 minutes while it is open.

The settings in Figure 10-7 let you do exactly that. In the Refresh section I tell Excel to refresh the data every ten minutes and to refresh on file open. The refresh remembers all the settings in the import and handles resetting the sheet automatically. The application is designed with a data sheet that holds all the data and does nothing else. This makes it easy to control the flow of new data into the application using the import properties.

How cool is that? I don't have to write any code; I don't have to change anything in the application. I just set up the import and refresh options on the data sheet and it's done.

All of the applications in this book use a data sheet like the one in Chapter 8, and all of them could be linked to text files using this technique.

10.1.2. Delimited Files

Delimited files are like fixed length files except the data items are separated by a special character. The most common choices for the special character are comma and tab, but any character can be used.

Delimited files are more modern than fixed length, and are more likely to come from a PC or server than a mainframe. They take up less space and are easier to handle in VBA.

Some text files use a text qualifier to mark the beginning of string data. But files coming from older systems often do not use this convention, and for these files the delimiter can be a problem.

It is critical to select a character that absolutely cannot turn up in the data. If it does, the delimiter occurs too many times on a line and the import process loses track of where it is in the data. The rest of the file will be imported out of place. Comma delimited files that contain names and addresses are susceptible to this problem. If there is any doubt, it is best to use a really unusual character as the delimiter, like | or `.

The import process is the same for delimited files except for Step 2, in which instead of mapping the file you specify the delimiter as shown in Figure 10-8.

Figure 10-8. Telling Excel what delimiter to use


Here I am importing a delimited version of the same stock file. The delimiter is a comma, but the dialog lets me select any character. The rest of the process is the same, including the ability to control data refresh with the properties dialog.



Analyzing Business Data with Excel
Analyzing Business Data with Excel
ISBN: 0596100736
EAN: 2147483647
Year: 2006
Pages: 101
Authors: Gerald Knight

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net