11.2 Data Management Within Excel


11.2 Data Management Within Excel

Since many of Excel's data management functions are unknown even to advanced Excel users, this section describes such commands primarily from the point of view of the user . Some brief information on running these functions via VBA code follows in the next section.

Pointer  

Please note that the commands described here can generally be used in any Excel worksheet. Excel does not distinguish in any systematic way between a worksheet table and a database. Rather, it interprets every associated range of cells as a database.

Creating a Database in Excel

The database concept in Excel is irresistibly simple: Any table or list can be interpreted as a database. The only condition is that each column must contain information of a single type (such as all addresses in column C, telephone numbers in column D). From this it follows that a unit of associated information (say, that consisting of a name , address, and telephone number) must reside in a row. This informational unit is frequently called a "data record" or simply "record."

In practice, a database table usually is distinguished by its first row, which contains titles of the various columns . It is a good idea to leave that part of the worksheet below the data entries empty, so that the database can be enlarged as needed. The concept of a workbook in Excel lends itself to placing each database in its own worksheet (and thus clearly separated from other data in a workbook), though this is by no means a requirement. In theory, you can place several databases in a worksheet next to each other or even on top of one another. You can even select an arbitrary range of cells and apply database commands to it (such as Data Sort ).

Installing a New Database

Creating a new database within Excel is no different from setting up an ordinary worksheet table. You input some data (addresses, say) in the usual way and label the columns. With larger data sets it is useful to divide the window and to fix it so that the row of column labels is always visible.

If some columns have cells containing extensive information, then you should activate the attribute "Wrap text" in FormatCellsAlignment. Excel can then split long entries as needed over several rows (within a cell ). Excel will increase the row height automatically. As a rule, this makes the table easier to read than having columns that are too wide. At the same time, the entire table should be aligned vertically from the top (in the same command as above), so that all entries in unusually tall rows begin at the top (and not at the bottom).

Tip  

In formatting of cells it is generally a good idea to format entire columns (click on the column head) or even the entire worksheet (click on the corner between the column and row heads) and not individual cells. First of all, from Excel's point of view this is more efficient, and secondly, this formatting is independent of the height of the table; that is, it automatically applies when the table is extended.

An Employee Database

As an example database, the file Staff.xls is available to the staff of a firm, members of a club, or students and faculty of a school, or whatever. In the following paragraphs are presented some details of this minidatabase.

In the first column the individual's gender (m/f) appears, and in the second, the form of address (formal, personal). From this information the third column, salutation, is created, which distinguishes between the formal "Dear Mrs. Pachleitner:" and the personal "Dear Hermann,".

The formation of the salutation takes place in the function Salutation . This function takes as parameters the contents of the first four columns. According to the contents of male_female and formal_private the resulting character string is created and returned as the function result.

 ' example file Staff.xls Public Function  Salutation  ( _     male_female As Variant, formal_private As Variant, _     family_name As Variant, first_name As Variant) As String   If Left(male_female, 1) = "m" Then     If LCase(formal_private) = "f" Then       Salutation = "Dear Mr. " & family_name & ":"     Else       Salutation = "Dear " & first_name & ","     End If   Else     If LCase(formal_private) = "f" Then       Salutation = "Dear Mrs. " & family_name & ":"     Else       Salutation = "Dear " & first_name & ","     End If   End If End Function 

The column labeled "Month of Birth" is created by taking information from the column labeled "Date of Birth." Since the month is formatted as a string of three letters , Excel creates an abbreviation for the month. The only purpose for this column is to be used as a filter criterion. (You will find very few database programs that allow you so easily to find everybody with a birthday in a particular month. You should take note of the way that this is done, that is, the creation of a column that functions as a filter criterion. It is characteristic of databases in Excel.)

In the column "Age" an attempt is made to render the current age of the employee as a whole number. For this purpose the date of birth is subtracted from the current date NOW() , and the result is transformed into a number of years with YEAR() . Since the reckoning of time in Excel begins with 1900, the result for a twenty-five-year-old is at this point 1925. Therefore, YEAR(0) is subtracted off. This formula is simple, but it is not quite exact. Due to leap years, it can happen that one day before or after a birthday can lead to an age that is off by one year.

 =YEAR(NOW()-K3)-YEAR(0) 

Finally, the columns " Group A, , GroupD" should be mentioned, in which a numeral "1" is used to indicate that an individual belongs to that particular group. (You could as well use a different symbol, such as an "x". The number 1 has the advantage that it can be used for calculations. It would be possible, for example, to create a column of sums that would give the number of groups to which each individual belonged.

Pointer  

The database Staff.xls also possesses the buttons Prepare Microsoft Word Mail Merge and Autofilter on/off. The associated macros and their functions appear below, in the next section. Background information on calculations with date and time can be found way back in Chapter 5.

Input, Alteration, and Deletion of Data with the Database Form

In general, databases can be edited directly in the worksheet; that is, data can be altered , deleted, and newly input. An alternative to this, however, is the possibility with the command DataForm to display a so-called database form (Figure 11-2). In this form is displayed exactly one data record. This data record can be edited or deleted. Database columns containing formulas (rather than input values) cannot be changed in the database form (in Figure 11-1, for example, the birth month, which is provided by the birth date). Accidental changes can be undone with the button Restore. The function EditUndo is not available after the termination of the database form.

click to expand
Figure 11-1: An employee database
click to expand
Figure 11-2: The Excel database form

With the button New you can input a new data record. New data records are always inserted at the end of the database. In this case formatting and formulas are taken from the last row of the database. The database is not automatically sorted.

The vertical scrollbar in the database form allows rapid transit through the data records to the one you are looking for. The order in which the records are displayed corresponds to the order of the records in the table ”if the data should appear sorted according to some criterion, you must do the sorting before invoking the database form.

The button Criteria in the database form leads to an alternative display of the database form, in which you can provide criteria for the individual fields. Then with Find Prev or Find Next you can find the next data record (from the current one) that fulfills these criteria. Possible search criteria are " A* " (all names that begin with "A" ) or " >100 " (numbers greater than 100). The syntax for search criteria is treated further on, when filtering of data is discussed.

Working with the database form has both advantages and drawbacks in comparison to working directly in the table. First we discuss the advantages:

  • In the database form all data of a data record are displayed in compact form. Database tables are often so wide that you can see only a piece.

  • Use of the database form requires no special knowledge of Excel (other than minimal Windows savvy). Database forms are therefore especially suited for input of data by Excel novices.

  • It is almost impossible to wreak large-scale havoc among the data. Particular care is required only in the input of new data: The button New must be clicked before data input begins. Otherwise, Excel interprets the data not as a new data record, but as a correction to the current data record. (There is also the possibility to edit, via a database form, a database whose window is not visible on the monitor.)

The predefined Excel database form is very convenient to use. However, it has its drawbacks:

  • Filter criteria placed in the table (see below) are not taken into account.

  • Numerous database commands ”such as sorting ”cannot be executed unless the form is closed.

  • The structure of the form is fixed. The labeling of the data fields corresponds to the labels on the columns of the database, and the same amount of space is just as big (or as small) for one as for another. It is impossible to set up the form so that only selected data fields can be altered.

  • An automatic plausibility control over the input (such as whether a valid date has been entered in the birth date field) cannot be instituted.

  • A simultaneous processing of several data records (such as to delete all obsolete records) is impossible.

  • When the database form is exited, no value is returned to VBA code giving information about the last data record to have been displayed.

Sorting, Searching, and Filtering Data

Sorting Data

One of the most important database commands is DATASORT. This command leads to the dialog displayed in Figure 11-3. There you can input up to three columns of your database according to which you would like the data to be sorted. The input of several sort criteria can be useful, for example, for printing a multiple mailing in which you can save postage costs by sorting first by state and within state by ZIP code.

The sort command sorts, as a rule, in the range in which the cell pointer is located. However, the command can also be used to sort cells that have been explicitly selected, independently of whether that particular range of cells is related to a database. In this connection the meaning of the two option buttons "Header row" and "No header row" becomes clear. Normally, in the sorting of a database, Excel selects the entire database including the header row. But the header row is, of course, not sorted along with the database. On the other hand, if a selected range of cells is to be sorted, then such a header usually does not have a header row, and then the option "No header row" must be selected.

click to expand
Figure 11-3: Dialog for sorting a database

The button Options allows for greater control in the execution of the sort command. First, one may determine whether Excel should distinguish between uppercase and lowercase (which normally, it does not). Second, special sort orders (such as day of the week or month) can be set. The sort lists presented correspond to the lists that can be set with OptionsExtrasCustom Lists. And finally, you can sort columns instead of rows. A sort by column appears seldom in practice. In database applications it is difficult to imagine a situation in which such a sort would make sense.

The sort command can be undone with an immediate appeal to EditUndo after it has been executed. Nonetheless, the command should be used with great care. It can destroy the construction of a carefully crafted table.

Particularly critical is the sorting of tables whose formulas relate to cells outside of the data record or tables to whose cells reference is made in other formulas. Upon sorting, Excel changes all relative references that refer within the sorted range to cells outside of the data record. Depending on the construction of the table this can make sense, or it could spell disaster! After executing the sort command check the contents of the table. If the result does not correspond to your expectations, undo the sort command and alter the table's construction by replacing relative references with absolute references (a painful task that cannot be automated) or by replacing formulas by their values (EditCopy, EditPaste SpecialValues).

Tip  

When the command DataSort is executed while the cell pointer is located in a pivot table, the data fields of the pivot table will be sorted anew. The position of the cell pointer specifies which data are used as the sorting criterion.

Tracking Down Data

The command EditFind is actually not a database command, but it can be gainfully employed in databases. The command leads to the dialog exhibited in Figure 11-4. If no range of cells is selected before the command is executed, Excel searches for the search expression in the entire active worksheet. The search is not restricted to the current database. The search phrase can contain either or both of the wildcards * (for an arbitrary number of characters ) and ? (for exactly one arbitrary character), but conditions like "> 3" are not allowed.

click to expand
Figure 11-4: Dialog for searching the data

Let us now examine the options of the Find dialog, not all of which are equally easy to understand. With the "Search" box you can determine whether the data are seached by row or by column. In particular, in the case of large tables this option can have considerable influence on the search time. (This holds particularly true for macro programming if the command is to be employed frequently.) The search begins in the currently active cell.

The selection field Look In determines the source of the data: The standard setting "Formulas" means that the search text should be looked for only within a formula (such as the search text "3" in "=A3" or in "=A5+3" ). The setting Values

looks only at results derived from a formula (thus the data displayed in the cell). With the setting COMMENTS you can search for information in the comments attached to cells.

The option Match Case is clear. With "Find entire cells only" is meant that the search phrase must match the entire content of a cell, and not just a part thereof.

Data Filters (Search Criteria)

By filtering data we mean that only those data records of a database are displayed that meet certain criteria. By setting up a filter you can have displayed from among an extensive database only those records that you wish to edit (complete, delete, copy, etc.).

Autofilter

The simplest form of filter is offered by the command DataFilterAutofilter. It results in the placement of small list selection arrows in the lower right-hand corner of the header cells. Clicking on one of these arrows opens up a list that offers, in addition to all the entries in the particular column of the database, the following settings:

"xyz":

Indicates the records whose data field matches the dummy character string "xyz" exactly.

"(All)":

Displays all data records.

"(Blanks)":

Indicates data records whose field in the current column is empty.

"(NonBlanks)":

Indicates data records whose field in the current column is not empty.

"(Custom)":

Indicates those data records that correspond to the selected criterion in the dialog box (see Figure 11-3).

"(Top 10)":

Indicates those records with the largest or smallest values. It does not necessarily have to be ten records. You can specify the number in an options field. The top-10 variant was new in Excel 7.

As soon as a filter has been selected, Excel displays only those rows of the table whose records match the criterion. The selected filter buttom and the column head are displayed in blue, to indicate that for the moment not all the data are visible.

Several filters can be combined with a logical AND, for example, for selecting all records of the employee database that represent employees belonging to both of groups A and B. (The combination of filter criteria with a logical OR ”all employees , say, in Group A or B ”is possible only with special filters; see below.)

If data are changed after a filter has been set, this has no influence over whether that particular record will be shown. The decision as to which rows are visible occurs only once, namely, when the filter is set.

The command DataFilterShow All restores all of the filters to "All," thus leading to the unfiltered display of all the data records.

Custom Autofilter

With a custom autofilter it is possible to assemble somewhat more complex search criteria. Figure 11-5 shows that a criterion can be assembled from two conditions, which can be joined by a logical AND or OR. However, this combination affects only the conditions for the criterion of the current column, not for several filters in a database. As single conditions one is allowed patterns that are determined by wildcards (such as "*in*ton" for "Clinton" or "Washington") as well as comparisons (such as ">=100" or "<F").

click to expand
Figure 11-5: Dialog for setting a custom autofilter

Examples of Autofilters

With autofilters you can create the following groups from the database Staff.xls :

  • All employees in Group C

  • All female employees in Group C

  • All employees who live in Tuscaloosa

  • All employees who live outside of the Orkney Islands

  • All employees whose birthday is in February (birth month = Feb) and are at least fifty years old (age > 50 or, better, date of birth < 3/1/1953, say)

Advanced Filters

In addition to the user-friendly autofilters, Excel offers so-called advanced filters. The invocation of DATAFILTERADVANCED FILTER opens up the dialog pictured in Figure 11-6. The effect of an advanced filter is in principle the same as that of an autofilter: Only the data records satisfying the criteria are displayed. (The copying of filtered data is considered a few paragraphs below.) The effect of an advanced filter can be undone, as with an autofilter, with DataFilterShow All.

click to expand
Figure 11-6: Dialog for setting an advanced filter

The decisive characteristic of advanced filters is that the filter criteria must be formulated in a specific range of cells. This range consists of a header row and one or more criterion rows. In the header row are entered all the column headers that are relevant to the search criteria. The headers must match exactly those in the database (including possible blank spaces). The criteria cells contain conditions for the database columns. Several conditions in a row are joined with "and" and must be satisfied simultaneously . When several rows contain criteria, these are joined with "or."

One is allowed to have as conditions, as in the case of autofilters, comparisons such as " >10 ", " <=5 ", or " >=A ", as well as pattern strings such as " =M*r ". Note that character comparisons that Excel might interpret as a formula are prefixed by the single-quote character. Otherwise, Excel produces the error message #Name , because the character string does not represent a valid cell name.

For many conditions it is required that a column of the database appear twice in the search criteria, such as for all employees whose age is ">=20" and "<=30."

Figure 11-7 shows some examples of possible filter criteria. Please note the difference between OR compounds (in the first example) and AND compounds (in the second and third examples). The criteria in Figure 11-7 are attractively presented with a border and boldface type, but this format serves only better readability and is not a requirement. On the other hand, you must be sure that the column head agrees exactly with that in the database.

click to expand
Figure 11-7: Some search criteria for the employee database

In the example file Staff.xls the filter criteria have been housed in their own worksheet. This has the advantage that it can be displayed in its own window irrespective of the visible portion of the database table. The ranges of cells with the criteria were given the names "Filter1" through "Filter4" so that the input of a critierion with DataFilterAdvanced Filter can be conveniently carried out: You can set the cell pointer anywhere in the database, execute the above command, and specify "Filter1" through "Filter4" as the criteria range. For this the filter criteria do not have to be visible on the monitor. To try out advanced filters in Staff.xls , set the cursor in a cell of the staff list (sheet "database"). Then execute DataFilterAdvanced Filter. Excel automatically selects the entire database as List Range. You have only to type in "filter1" through "filter4" in the Criteria Range field and click OK.

Pointer  

The database functions introduced in the next section must likewise have criteria specified for them. Here, too, the criteria are formulated in a range of cells. The syntax is the same as with filter criteria.

Copying and Deleting Advanced Filters

If you wish to copy filtered data to another location in the table, you must select the option Copy To Another Location in the Advanced Filter dialog and specify an output range. The output range must be headed (that is, with column heads). At first glance this may seem like unnecessary additional labor, but in fact, it makes possible the selective copying of particular columns of the database. Much more burdensome is the restriction that the output range must be located in the currently active worksheet. It would often be worthwhile (and more readable) to copy the filtered data into another worksheet. In fact, that is possible, but DataFilterAdvanced Filter must be executed while another worksheet is active.

In the example file an output range is defined, beginning with cell T23, that contains only the three columns giving first name, last name, and telephone number. If you would like to try out the copy command, place the cell pointer in the database, select the command DataFilterAdvanced Filter, and in the dialog activate the copy mode. Then give a criteria range of "Filter1" as above (but you must also check the option Copy To Another Location and insert this location in the Copy To field) andx a "Copy to" range of T23:V23. (See Figure 11-8.)

click to expand
Figure 11-8: The result of copying with an advanced filter

Unfortunately, there is no analogous copy command for autofilters. If you wish to copy selected data records via an autofilter, you must manually select and copy them. Selectively copying only certain columns is achieved only with extra effort ”by hiding columns in the database.




Definitive Guide to Excel VBA
Linux Application Development For The Enterprise (Charles River Media Programming)
ISBN: 1584502533
EAN: 2147483647
Year: 2002
Pages: 134

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net