Data Sources | Beginning ASP.NET 2.0 and Databases (Wrox Beginning Guides)

Chapter 1 - Displaying Data on the Web

byJohn Kauffman, Fabio Claudio Ferracchiatiet al.?
Wrox Press ?2002

So: you've already considered some or all of the issues in the above lists, and you're still with us, which means that it's reasonable to assume you want to write a data-driven web application. The first question that needs to be answered, then, is where the information that will eventually end up on the user's screen is going to come from. Depending on factors such as the type of data, what operations are to be performed on the data, and the amount of use that is going to be made of the system, there are a multitude of options available. This section describes the reasons for and against using three of the most common data source types, along with an overview of the other types available.

Databases

When you start thinking about data sources, the most obvious one that springs to mind is the database, which will generally provide the most reliable, scaleable, and secure option for data storage. When you're dealing with large amounts of data, databases also offer the best performance. However, the very fact that other solutions exist is a sure indication that in some circumstances, they're not the best choice.

In general, databases are designed to store large amounts of data in a manner that allows arbitrary quantities of data to be retrieved in arbitrary order. For small collections of data, such as a set of contact details, the time and other costs involved in creating and accessing a database might outweigh the benefits that databases provide.

We'll have much more to say about the structure of databases in the next chapter, but as a quick example, wanting to store some information about a company employee in a database might move us to create a table called Employee that can contain the same pieces of data about a number of employees. Such information could include their EmployeeID (number), LastName, FirstName, BirthDate, and Country:

Throughout this chapter, comparisons and demonstrations will be made of how data can be stored and represented. For consistency, the same example is used throughout: that of storing details about the employees in an organization.

One thing to note when we display a database diagram, compared to the diagrams of other data sources, is that it's based on a model of the information being stored, rather than examples of the data. The way in which databases actually hold information is largely hidden from the outside world, leaving us to depict concepts rather than actual data items.

Text Files

At the opposite end of the scale from using databases to store information for a web site is the use of text files. Although text files can store information in almost any conceivable format, they are generally used for storing a set of data, one item on each line. If we were to capture the employee information detailed above, we could store the LastName, FirstName, BirthDate, and Country of two employees in a text file as follows:

    Smith, John, 05-04-1979, UK    Bloggs, Joe, 29-09-1981, US

For simple information such as this, a text file provides an easy way of reading and writing data. If the data to be stored has more structure, however, it becomes far more time consuming. For example, it could be the case that each of these employees has placed an order for some office supplies. Rather than adding all of that information to the text file as well, it would be better to hold it separately, and then define relationships between the two sets of data.

When the data starts to gain 'structure' in this manner, a method of giving the file itself some structure must be found, and a way of retrieving it and representing it in memory must also implemented. One way of doing this is through the use of XML.

XML

In some ways, XML documents can be thought of as a stepping-stone between text files and databases; they store data using text files, but use a hierarchical and relational format that is both extensible and self-describing, providing a number of the benefits of a database system. Before we go any further in explaining the use of XML as a data source, a sample fragment of an XML document is shown below:

    <company>      <employees>        <employee LastName="Smith" FirstName="John"                  BirthDate="05-04-1979" Country="UK" />        <employee LastName="Bloggs" FirstName="Joe"                  BirthDate ="29-09-1981" Country="US" />      </employees>    </company>

As you can see, the same information is being stored as in the text file, but there's also an indication of the nature of that information. You know that 29-09-1981 is the BirthDate of Joe Bloggs, because the data says so. Another benefit of XML is that it can contain multiple types of information in one document; a fragment like the one below could be inserted after <employees>:

    <orders>      <order >        <product>Staples</product>        <product>Pencils</product>      </order>      <order >        <product>Biros</product>        <product>Erasers</product>      <order>    </orders>

Using the comprehensive functionality that's built into the XML-handling support provided by the .NET Framework (and other platforms), retrieving and manipulating the orders separately from the employees can be accomplished quite easily. This makes it possible to specify an order from the list for each employee by storing the ID of each order as part of the employee's details:

    <employee LastName="Smith" FirstName="John"              BirthDate="05-04-79" Country="UK" Order="2" />

XML is a powerful way of representing information, but in some circumstances performance can be a problem: updating and retrieving data from XML can be a time-consuming process. This is rarely an issue when a few users are accessing a small amount of data, but if there's a lot of data (or a lot of users) it can sometimes become one.

Other Sources

Between them, the three options enumerated above cover the main categories of data store, but there are many others that either fall between these, or follow a completely different paradigm. Most of the types that we haven't covered, though, are domain-specific - that is, that they've been developed to suit a specific task. On the Windows platform, typical examples of these include:

Microsoft Exchange Server - the data store containing e-mail, calendar, and contact information
Active Directory - the information that's stored by Windows-based servers regarding the users on the system, their permissions, etc.
Spreadsheets - applications such as Excel store their data in a tabular format, a grid containing values that are used in tasks such as financial calculations.

In summary, although this book is focusing on databases (and uses them in the majority of its examples), it is important to remember that databases are not the only kind of data store, and that other mechanisms for storing data can often achieve the same goal more efficiently.