| || |
Databases are stores of data. That's it. We won't dissect it further because a database can be any store of data, from a simple text file to a complex product inventory system spanning multiple computers. Databases are among the most important aspects of computing, which explains their importance to Web development.
Interacting with databases is one of the most useful things you can do with ASP.NET. Microsoft recognizes this and has made it very easy for you to interact with databases using ASP.NET.
Many commercial database applications are available today, including Microsoft Access, Oracle, Informix, and DB2. These all provide different mechanisms for storing data, but they all share common database concepts.
| || |
In database terminology, a table is a two-dimensional representation of data. Again, databases come in a wide variety of shapes and sizes, so tables can be nearly anything. Figure 8.1 shows a simple database table created in Microsoft Notepad.
Figure 8.1. A database table can come in many forms, including a simple text file created in Notepad.
The table in Figure 8.1 shows two columns: Cars, which contains the model names; and Manufacturers, which contains the names of the manufacturers. A column, therefore, describes one property of information. A row contains actual data from one or more columns it describes one item in detail, and is often called a data record. A column in a row is called a field. For example, Figure 8.1 has 11 rows (or records), and each row has two fields: Cars and Manufacturers.
It may be helpful to think of a field as the intersection of a row and a column.
Each record is a unique item, and you can have as many fields to describe that item as you wish. For example, you could add columns for the number of miles, the year, and the last oil change.
Figure 8.1 shows a database consisting of a single table. However, databases usually consist of many tables, each of which is a type of object. Multiple tables help you to relate information more logically and efficiently. For example, let's assume you want to store the following information about cars and manufacturers in your table:
Although you could store all this information in a single table, you'd be repeating a lot of information over and over again. For instance, Geo only has one address and phone number, so why repeat that information for every car, as shown in Figure 8.2?
Figure 8.2. Redundant information in a table.
| || |
Typically, this type of data storage is called a flat-file database, in which information corresponding to multiple tables is stored in one giant file. A lot of information is repeated, taking up a lot of space and time. Flat-file databases are also error-prone. Imagine having to enter the same address over and over again manually. Chances are, you'll make a mistake. A much more efficient way to store this data would be in two separate tables, as shown in Figure 8.3.
Figure 8.3. Moving redundant information into a separate table.
| || |
In this manner, the Cars table now only has to keep track of the manufacturer's name. The name field is then related to the name field in the Manufacturers table, which stores all pertinent information about the manufacturers. This type of database, in which tables are linked by relationships, is called a relational database.
| || |
Relational databases offer many benefits over flat-file databases, including more efficient storage, faster searching, and more logical grouping. This book won't go into detailed design methods for relational databases, but there's a whole study of these techniques called normalization that dictates how to properly separate database tables into distinct objects. In general, if you see data that's repeated more than a few times in a table, it may be helpful to move it into its own table and create a relationship between the two tables.
For more information on database design, check out Sams Teach Yourself Database Design in 24 Hours.
Keys and Constraints
| || |
As discussed previously, it's important to keep repetitive information in a database to a minimum. Most importantly, you don't want to have duplicate records because that's the ultimate waste of space. Therefore, primary keys have been introduced to prevent such an occurrence. Let's assume that you have a database of family members, with the columns name, address, and birthdate. You could declare the name column as the primary key, and then no two records could have the same name.
This is very helpful for keeping your database under control, but it's not a very realistic solution because many people can have the same names. As you'll see later today in "Creating Databases," you can also define primary keys over multiple columns as well. In this case, if the primary key included the name, address, and birthdate columns, the database would disallow entering records in which all three columns contain the same information as another record in the table.
| || |
This is more helpful than the previous example, but it's still not foolproof. Perhaps you discover a long-lost second cousin with the same name and birth date as your sister, who just happens to move in with you. This would give you two records in your database with the same name, address, and birth date. An even better way to ensure that your database contains no duplicate records is to introduce another column that has nothing to do with your other columns, and for which you can guarantee uniqueness. Such a column is called an identity column because it can be used to uniquely and accurately identify a record in your database. The most common identity column is simply an incrementing number every new record would get a number that's one higher than the record before.
| || |
Now that you've ensured that each record in your table is unique, you need to create a link to another table. You do this by building a relationship between columns in different tables. Each column should contain the same information. In Figure 8.3, you can see that the tables are linked by the Manufacturer column in the Cars table, and by the Name column in the Manufacturers table. Although the columns have different names, they both contain the same information the manufacturers' names. Columns that are used to link two tables are called foreign keys.
The Cars table has a foreign key to the Manufacturers table, and it's a many-to-one relationship. This means that there can be many records in the first table that correlate to one record in the second table. For example, there are many cars made by Geo, but there's only one company named Geo. Similarly, there can also be one-to-many and one-to-one relationships.
Database Communication Standards
| || |
There are many different database systems out there, such as Microsoft Access, Oracle, and Informix. If each one required a unique protocol to access the database information, developers would have a difficult time creating data-driven applications that weren't specifically tied to a particular database system. Fortunately, most commercial database systems adhere to the Open DataBase Connectivity standard, or ODBC. This standard was developed by Microsoft to provide a common interface for databases so that developers don't have to write different code to access different databases. Databases that adhere to this standard are ODBC-compliant.
Unfortunately, ODBC has its limitations. Developers have criticized it for being difficult to work with because it requires low-level application code that varies depending on the database being used.
In response to this, Microsoft next developed OLE DB, a Windows COM-based data access object that provides access to all types of different data. OLE DB is now the standard for database connectivity. It includes ODBC as well, so virtually every commercial database is supported.
When Should I Use a Database?
Databases are an efficient way to store large amounts of persistent data. Managing such information comes at a performance cost, though. Although databases are extremely efficient at storing large amounts of complex, interrelated data, you may want to simply use a cookie or session variable if you only want to store temporary or miniscule amounts of information such as a site visitor's name or e-mail.
Databases are the optimal choice when you need to store information that forms complex relationships between different types of data. Another advantage of databases is that they provide a central storage location for large amounts of data. Most large software applications that require data storage use a database of some sort.
Working with databases through an ASP.NET Web page is an easy and natural process. In a little while, you'll see how easy it is to access any database from your ASP.NET pages.