Chapter 22: Excel and the Structured Query Language | Microsoft Office Excel 2003 Programming Inside Out (Inside Out (Microsoft))

< Day Day Up >

One of the many useful features of Microsoft Excel is its ability to access information in databases. You can copy information from a database table to a worksheet or from a worksheet to a database table. You can even combine information from a Microsoft Access database or a Microsoft SQL Server database with a user form.

This chapter serves as a brief introduction to using databases such as Microsoft Access and Microsoft SQL Server with Microsoft Excel. Because designing an effective database is beyond the scope of this book, design is not discussed at all. But there are a number of good books that cover Microsoft Visual Basic and database programming. However, in the context of programming Excel 2003, you should look for a book that covers database programming for Visual Basic 6, not for Visual Basic .NET. Visual Basic .NET is not compatible with the Visual Basic for Applications (VBA) facilities included with Excel. On the other hand, Visual Basic 6 is fairly close to VBA, and it uses the same tools (specifically ADO) to access a database. So the database programming techniques for Visual Basic 6 can be carried over to Excel and VBA without change.

Comparing Spreadsheets and Databases

In many respects, a worksheet, consisting of a series of rows and columns, is very much like a table in a relational database. This similarity can make a worksheet a powerful tool for manipulating database information.

Fundamental Database Concepts

A relational database consists of a series of tables. Each table roughly corresponds to a single worksheet. A table is structured as a series of rows and columns of data, as shown in Figure 22-1.

click to expand
Figure 22-1: A database table is constructed as a series of rows and columns.

A database table doesn't rely on row numbers, like a worksheet. Also, columns don't have arbitrary column names such as A, B, or C. Instead, tables rely on set theory, which means that you deal with groups of rows, not the individual columns. A set of rows extracted from one or more tables can be thought of as a virtual table or a view. The tables used to create a view are known as base tables. A view can be created from one or more base tables or other views in the database. Views are important because many operations in a relational database create or return virtual tables.

Unlike worksheets, whose cells can store any type of data, relational databases require that each column must have a single data type. Typically, this data type can be a binary number, a string of characters, a Boolean value, a date/time value, or an unformatted binary value. In addition to having a data type associated with each column, a column in a particular row may contain a Null value. Null means that a value has not been assigned to the column. Notice that an empty string isn't the same thing as a Null value; an empty string is a string that contains zero characters, whereas Null simply means that the column doesn't have a value.

Note

It's possible to create tables where Null values are not acceptable for a particular column or columns. See your database documentation for more information about how Null values are handled.

Because of the rigid structure imposed on tables, data is typically separated into multiple tables, with each table containing information about a single entity. Thus, you might have one table containing information about customers, another containing information about products, and a third table containing information about the orders the customers place. Fortunately, you can manipulate this complex arrangement of data using a language called the Structured Query Language (SQL). SQL statements make it possible to retrieve and update information stored in tables.

Note

Depending on whom you talk to, you'll hear the acronym SQL pronounced either as the word sequel or as a series of letters (S-Q-L). Both pronunciations are correct, but the first came about because the Structured Query Language used to be known as the Structured English Query Language.

Inside Out
Database Design and Administration

Designing and using databases can be a complex task. Most organizations have one or more individuals that are responsible for the database system. This is especially true if you are using a high-performance database such as Microsoft SQL Server, Oracle, or DB2. If you are not sure how these databases work in your organization, look for the individual called the database administrator. The database administrator should be able to answer many questions that apply to your specific environment.

In situations where you are dealing with simple database systems such as Access, you might not have a database administrator. In these cases, you might want to talk to the person who wrote the programs for the database. This individual should be able to provide you information, such as connection strings, that you need to open the database from Excel.

If you want to design your own database, keep in mind the following: KISS (Keep It Simple Stupid) and SMILE (Simple Makes It Lots Easier). If you have only a few hundred pieces of data, consider keeping all the data in a set of Excel worksheets. You might find it easier to maintain than building a real database.

If you really believe that you need a database, take some time and read some books that provide a good introduction to databases such as Microsoft Office Access Inside Out by John L. Viescas (Microsoft Press, 2003). Just keep in mind that it's very easy to design a database that's so complex to use that it really doesn't address your original needs.

Database Keys

A database key is a way to identify a set of rows in a table. The primary key for a table is a way to uniquely identify a particular row in the table. Often the primary key is a single column, such as CustomerId or EmployeeId, which could uniquely identify a row in the Customers or Employees tables. However, it's desirable that the primary key for a table can span multiple columns. In other words, the combination of OrderNumber and ItemNumber uniquely identifies a particular item purchased within an order.

A secondary key is different from the primary key in that it need not uniquely identify a single row. In fact, secondary keys are useful for locating groups of rows that have something in common, such as all the customers in a particular city or all the orders that were placed on the same date.

A foreign key is merely a secondary key in one table that is also the primary key in another table. Foreign keys are useful for identifying a group of data that is related, such as the collection of orders that was placed by a particular customer or the item numbers that comprise a single order.

Tip

Indexes Save Time
Primary keys and foreign keys are often used to optimize database access. Database designers use the keys to create indexes on the tables, which can significantly speed up access to a particular set of rows. Whenever possible, you should use indexes as part of the search criteria when retrieving data from a database.

Some databases, including Microsoft Access, allow you to define an identity column. An identity column (called an AutoNumber column in Access) automatically generates a new value when a row is inserted into the table. This type of column is extremely useful as the primary key for a table because the database guarantees that the value is always unique.

Accessing Databases from Excel

Databases are typically organized around the client/server concept. (See Figure 22-2.) This model assumes that the database server is separate from the database program that accesses the server. A database program uses a particular application programming interface (API), which in turn, communicates with the database server.

Figure 22-2: Databases are typically organized as a database client talking to a database server.

Excel relies on an API called Active Data Objects (ADO) to access databases. ADO is a standard component in Microsoft Windows, so it's always available to your program. Using ADO, you can connect to a variety of different databases, including Access, SQL Server, Oracle, and others.

Note

Although Access doesn't rely on the traditional client/server model, you still use the ADO interface to connect to an Access database.

Once you've established a connection with a database server with ADO, you can execute SQL statements to return data to your application or to make changes to the data stored in a table.

< Day Day Up >