An SQL Primer | Beginning ASP.NET 2.0 and Databases (Wrox Beginning Guides)

Chapter 2 - Relational Databases

byJohn Kauffman, Fabio Claudio Ferracchiatiet al.?
Wrox Press ?2002

Having looked at Codd's Rules that define what a relational database is, we can return to our discussion of SQL, which effectively defines what we can do with a relational database.

In fact, the name "Structured Query Language" is a bit of a misnomer, since a query means a request for data. While SQL's syntax includes statements like SELECT, which certainly is about requesting data, it also includes statements like INSERT (to insert new rows into a table), UPDATE (to modify existing rows in a table), and DELETE (to delete rows from a table). In fact, SQL is composed of two main parts:

DML, or Data Manipulation Language, which is composed of SELECT, INSERT, UPDATE, and DELETE statements
DDL, or Data Definition Language, which includes statements like CREATE, DROP, and ALTER to create, destroy, and modify database objects (like tables and views), and statements like GRANT to grant and revoke user permissions for database objects

We use the DML component of SQL to manipulate data, and the DDL component of SQL to design our databases. In this section, and for the overwhelming majority of this book, our focus will be aimed at working with data, so we will examine DML in a little detail.

SELECT Statements

Let us begin our journey into SQL with the SELECT statement, which is the one we use to retrieve data. The SELECT statement has three basic clauses:

The SELECT clause itself, which contains the list of fields to retrieve
The FROM clause, which specifies the tables from which we will retrieve data
The WHERE clause, which limits the rows to retrieve, based on the values of certain fields

Try It Out - SELECT Statements

Let's write our first SQL query in MS Access. As a preliminary to this example, choose the Insert | Query menu item, and select Design View in the dialog box that appears. Then, choose View | SQL View, and you'll be presented with something like this:

click to expand

In a SELECT statement, only the SELECT clause is mandatory, but in a real-world program you'll rarely see a query without a FROM clause, and probably a WHERE clause too. Let's look at an example of a very basic, yet meaningful query:
```
    SELECT      ShipperID,      CompanyName,      Phone    FROM      Shippers 
```
Type this query into the Query1 box, noticing as you do so that the SELECT clause contains the fields we want to retrieve, separated by commas. The FROM clause specifies that we want to retrieve these fields from Shippers table. Since we don't have a WHERE clause to limit the rows, we expect all of the rows to be retrieved from the Shippers table. Choose Query | Run, and this is what you'll see:
We can limit both the rows and the fields that our query returns by modifying the SELECT clause to include only those fields we want back, and by including a WHERE clause to limit only those rows we want back. Since the ShipperID of Speedy Express is 1, changing the query to the following just returns that company name:
```
    SELECT      CompanyName    FROM      Shippers    WHERE ShipperID = 1 
```
While we haven't yet begun to exploit the power of a relational database, we now have a sample of very simple syntax from which we can build. Suppose the Northwind sales reps have been getting complaints that orders shipped to Caracas, Venezuela have frequently arrived damaged. They would like to know which shipper(s) has been shipping to that region. This is a perfect opportunity to expand our FROM clause to include the Orders table as well, since that contains the "ship to" data, as well as the ShipVia field that links the shipment to the shipper via the ShipperID field in the Shippers table. We can exploit this relationship by typing in the following query:
```
    SELECT      Shippers.CompanyName,      Shippers.Phone    FROM Orders INNER JOIN Shippers      ON Orders.ShipVia = Shippers.ShipperID    WHERE Orders.ShipCity = 'Caracas'    AND Orders.ShipCountry = 'Venezuela' 
```
On running this query, we learn that Federal Shipping is the only shipper that has ever shipped to Caracas, Venezuela, and we can call them to complain:

How It Works

This SELECT query is a little trickier than the ones we've looked at so far, and as such it merits some further explanation. The first change comes in the SELECT clause, where to specify the fields we want to retrieve we've used the syntax <table name>. <field name>, instead of just <field name>. That's because this query references more than one table - if both tables happened to have fields with the same name, Access would have returned an error because it could not tell which field you wanted.

In this case, there was no ambiguity, because the Orders table does not have fields called Phone or CompanyName. However, we still prefer the <table name>.<field name> syntax, because it's clearer to the readers of our code.

With the above in mind, the new features of the WHERE clause are quite easy to explain: we've used the <table name>. <field name> syntax because our query contains more than one table, and we've also added the AND keyword, which allows us to place an extra condition on the rows retrieved.

That leaves the FROM clause, which has a new feature that's particularly relevant to this chapter on relational databases. However, before we can talk about what that INNER JOIN part means - and how it seems to be able to let us select Shippers fields from the Orders table - we need to take a closer look at some of the features of the Shippers table.

The first thing to observe is that the ShipperID field of the Shippers table is a unique index. This guarantees that a value in this field won't occur in the table more than once. A ramification of this is that you can identify a single row in a table through the value of a unique index. Now, in a relational database, a field in one table that has been designated as a foreign key can be related (or linked) to a unique index in a second table that's been designated as primary key - and this forms the basis for the kinds of relationship between tables that we saw in the diagrams earlier in this chapter.

An RDBMS will honor a private key-foreign key relationship by not allowing values in the foreign key that do not exist in the primary key. For example, the Shippers table has only three rows: one row has ShipperID = 1, another has ShipperID = 2, and the other has ShipperID = 3. Now, since ShipVia in the Orders table is the foreign key to the ShipperID primary key, the only values allowed for ShipVia are 1, 2, and 3. If a row in Orders has a ShipVia field equal to 1, then it corresponds to the row in Shippers where ShipperID = 1, and so on. The RDBMS would return an error if we attempted to place in the foreign key field a value not found in the primary key field.

Getting back to our query, the FROM clause exploits the private key-foreign key relationship by joining the two tables Orders and Shippers using the INNER JOIN syntax. This INNER JOIN syntax is the way we 'join' each record in Orders with its corresponding record in the Shippers table to form a new record. The new record contains all of the data from Orders and all of the corresponding records from Shippers. The SELECT clause can then return fields from either of the two tables, and the WHERE clause can filter the rows returned by the query, based on values from either table.

The selected rows from the Shippers table are limited by the results of the INNER JOIN. We can decompose the example to see how the join works:

    SELECT ShipVia    FROM Orders    WHERE ShipCity = 'Caracas'    AND ShipCountry = 'Venezuela'

click to expand

Using the same criteria in the WHERE clause as we did previously, we SELECT records from Orders WHERE ShipCity = 'Caracas' AND ShipCountry = 'Venezuela', but this time we return the ShipVia values from the Orders table. We still get the same two records back from Orders, as we would expect, but this time we examine the values of ShipVia for those records, and we see that the value is 3 in both. Or at least, we would, were it not for the fact that Access is being 'helpful' again by assuming that we'd rather see the company name.

In general, you can add as many tables as you need to a SQL query, depending on the data you want to retrieve. In the chapters to come, you'll have ample opportunity to practice writing SELECT statements, as well as learning to use them in your Visual Basic and ASP.NET applications.

INSERT, UPDATE, and DELETE Statements

We have used the SELECT statement to cover quite a bit of difficult ground. We've looked at primary and foreign keys, at what they mean for relationships, and how you can exploit them with joins. INSERT, UPDATE, and DELETE queries use those same basic concepts to allow us to modify data.

Try It Out - INSERT Statements

Let's begin by inserting a new row into the Shippers table of the Northwind database. Type the following code into the Query1 window, and run it:
```
    INSERT INTO Shippers    (CompanyName, Phone)    VALUES ('Wrox Deliveries', '(504) 666-6836') 
```
As you can see from the above screenshot, we have appended one row. We can examine our new row by typing in the following query. Don't worry about the format for now; we will explain it shortly in the How It Works section.
```
    SELECT      *    FROM      Shippers 
```
On running this new query, we find our new row in the query results:

How It Works

In an INSERT query, the INSERT clause contains the keywords INSERT INTO (where INTO is optional) and the list of fields that will receive values. The VALUES clause contains the values that will get placed into the fields; the field values should be in the same order as the field list.

Once the new row has been inserted, we examine it with the following syntax:

    SELECT      *    FROM      Shippers

The obvious thing to notice here is that in the SELECT clause, there's an asterisk in place of the field names you'd expect to see. An asterisk in the SELECT clause is a shorthand way to retrieve all of the fields for a particular row. This can be useful when you want to examine data quickly, or while you are developing your queries. Generally speaking, however, you should not use this syntax in your applications, as it provides no control over the order in which fields are retrieved.

Note

Avoid the SELECT * syntax in applications; use the SELECT field1, field2, ... syntax instead.

Notice that we didn't provide a value for the ShipperID field, and yet it has a new value - 4, in this case. This is because the designers of the Northwind database designated ShipperID as an identity field (Microsoft Access calls this an "AutoNumber" field). An identity field is a numeric field that automatically receives a new, unique value whenever a new row is added to its table. If you attempt to assign you own value to an identity field, you get an error.

click to expand

Try It Out - UPDATE Statements

Let's now change the value of the phone number in the row we just added by using an UPDATE statement. Type the following into the Query1 window (you may need to use a different value for the ShipperID), and run the query.
```
    UPDATE Shippers    SET Phone = '(503) 555-6836'    WHERE ShipperID = 4 
```
As you see from the screenshot below, we have again altered one row.

Our row now has the new phone number:

    SELECT      *    FROM      Shippers

click to expand

How It Works

This UPDATE statement looks straightforward and, for the most part, it is. The UPDATE clause includes the name of the table we want to update. The SET clause contains the field(s) and their new values (if you want to update multiple fields at the same time, separate each 'field = value' pair by a comma). Again we see the WHERE clause, just like in our other examples. Here, it ensures that we only affect the row that we just added, with ShipperID = 4.

Note

Be careful with UPDATE (and DELETE) statements! If you forget to add the WHERE clause in a SELECT statement, the query returns more rows than you wanted. But if you forget to add the WHERE clause in an UPDATE or DELETE statement, the statement will modify or delete every row in the whole table!

Try It Out - DELETE Statements

Finally, we can use the DELETE statement to delete our new row from the Shippers table:
```
 DELETE FROM Shippers WHERE ShipperID = 4 
```
As you see in the screenshot below, we have again affected one row.

Once again, the basic DELETE syntax is simple, but don't forget the WHERE clause! If you leave out the WHERE clause, you will delete every row in the table!