Recipe3.5.Importing Data from Web Pages


Recipe 3.5. Importing Data from Web Pages

Problem

You've found a web page containing useful data that you'd like to use in a spreadsheet; however, you don't feel like retyping it all in.

Solution

There are a few ways to get the data from a web page into Excel without having to retype it all. You could try copying and pasting, but this may give you format problems. You could also try saving the web page as a text file, but the formatting of the web page may make this cumbersome.

The most direct way to import data from a web page uses Excel's built-in capability to import data from tables and preformatted areas contained in web pages . From within Excel, select Data images/U2192.jpg border=0> Import External Data images/U2192.jpg border=0> New Web Query....

Discussion

Upon selecting Data Import External Data images/U2192.jpg border=0> New Web Query... from the main menu bar, youll see the New Web Query window shown in Figure 3-11. The New Web Query window is essentially a web browser. You can type in a URL in the address bar to go to the page containing the data you'd like to import. You can also navigate using the forward and back buttons (the circular ones with arrows, to the right of the Go button in the toolbar).

A key difference between this window and your usual web browser is that this window analyzes the web page to determine which data you can import. Each block of data that can be imported is marked with a little arrow icon in the block's upper-left corner, as shown in Figure 3-11. If you drag you mouse over one of these icons, the corresponding data block will be highlighted. Click the icon to select that block of data (the arrow will turn into a checkbox). Click the arrow icon in the upper-left corner of the page to import the entire page. Once you've selected the blocks of data you'd like to import, press the Import button.

The results of this action for the page displayed in Figure 3-11 are shown in Figure 3-12. Notice that the External Data toolbar is shown when the web query is completed, just as in the earlier example showing how to import data from an Access database. You can use the External Data toolbar to manipulate the web query. For example, if you press the refresh button, Excel will attempt to refresh the data obtained from the web page by reaccessing the page over the Internet. This can be useful if the data you've imported changes frequently.

Once you've imported the data, you can use it in your own analyses or manipulate it further using other techniques discussed in this chapter.

Figure 3-11. New Web Query window


See Also

You can save your web queries and rerun them at later times from other workbooks if you'd like. This is useful if you want to use the data in more than one workbook. You can save the query by pressing the save icon in the New Web Query window (see Figure 3-11). For more information see the help topic "Query for data from a Web page" in Excel's online help.


Recipe 3.6. Parsing Data

Problem

You've imported data from a text file as discussed in Recipe 3.3, but all data on each row is lumped into a single cell. You'd like to separate the data on each row into columns.

Solution

Select the data you'd like to parse. Then select Data "Text to Columns... " from the main menu bar to open the "Convert Text to Columns Wizard.

Figure 3-12. Data imported using web query


Discussion

The data shown in Figure 3-7 (in Recipe 3.3) was imported from a text file via drag-and-drop. The problem with this approach is that all the data on each row is lumped into a single cell. This is less of a problem with the import method in Recipe 3.1, because using File images/U2192.jpg border=0> Open... automatically launches the Text Import Wizard, which allows you to set columns for your data either manually or by specifying delimiters.

The "Convert Text to Columns" Wizard, accessible via Data "Text to Columns ...," is essentially the Text Import Wizard. The three steps in the "Convert Text to Columns Wizard are identical to those in the Text Import Wizard. First you specify whether the text to be parsed is in fixed-width form or delimited. Then, depending on your selection in step 1, you either manually set column breaks or select the delimiting character. Finally, you can specify data formats and choose which columns to ignore.

Performing these steps on the data shown in Figure 3-7 yields the parsed data shown in Figure 3-13.

Figure 3-13. Parsed data