Using SQL | Mastering the SAP Business Information Warehouse: Leveraging the Business Intelligence Capabilities of SAP NetWeaver

SQL (pronounced sequel) is the standard language for accessing relational databases. As you'll see in this chapter, SQL is easy to learn and use. With SQL, you tell the database what data you want to access, and the database software figures out exactly how to get that data.

There are many types of SQL statements, but the most commonly used types of SQL statements are these:

Data Manipulation Language (DML) statements
Data Definition Language (DDL) statements

DML statements allow you to retrieve, add, modify, and delete rows stored in the database. DDL statements allow you to create database structures such as tables.

Before you learn the basics of DML statements, you need to know how you can enter and run SQL statements. You can enter and run SQL statements against a SQL Server database using the Query Analyzer tool, and you'll learn about this next.

Note

As you'll see later in the "Accessing a Database Using Visual Studio .NET" section, you can also use Visual Studio .NET to create SQL statements. Visual Studio .NET enables you to create SQL statements visually, as well as entering them manually.

Using Query Analyzer

You use Query Analyzer to enter and run SQL statements. You start Query Analyzer by selecting Start ➣ Microsoft SQL Server ➣ Query Analyzer. In the following sections, you'll learn how to connect to a SQL server instance, enter and run a SQL statement, save a SQL statement, and load one.

Connecting to a SQL Server Instance

When you start Query Analyzer, the first thing it displays is the Connect to SQL Server dialog box, as shown in Figure 3.1. In the SQL Server field, you enter the name of the SQL Server instance to which you want to connect. You can click the drop-down list and select an instance of SQL Server, or you can click the ellipsis button (three dots …) to the right of the drop-down list to display a list of SQL Server instances running on your network.

Figure 3.1: Connecting to a SQL Server database

If you select the Windows authentication radio button, then SQL Server will use the Windows 2000/NT user information to validate your request to connect to SQL Server. If you select the SQL Server authentication radio button, then you will need to enter a login name and password.

In Figure 3.1, I've entered localhost in the SQL Server field; this corresponds to the instance of SQL Server installed on the local computer. I've also selected the SQL Server authentication radio button, and entered sa in the Login Name field and sa in the Password field (this is the password I used when installing SQL Server). These details are then used to connect to SQL Server. If you have an instance of SQL Server running on your local computer or on your network, you may enter the relevant details and click the OK button to connect to SQL Server.

Now that you've seen how to connect to the database, let's take a look at how you enter and run a SQL statement.

Entering and Running a SQL Statement

Once you've connected to SQL Server using Query Analyzer, you can use the Object Browser to view the parts of a database, and you enter and run SQL statements using a Query window. Figure 3.2 shows the Object Browser and an example Query window, along with the results of retrieving the CustomerID and CompanyName columns from the Customers table.

click to expand
Figure 3.2: Viewing database items using the Object Browser and executing a SELECT statement using the Query window

As you can see from Figure 3.2, you enter SQL statements into the top part of the Query window, and the results retrieved from the database are displayed in the bottom part. You specify the database to access with the USE statement, and you retrieve rows from the database using the SELECT statement.

Tip

You can also specify the database to access by using the drop-down list on the toolbar.

If you want to follow along with this example, go ahead and enter the following USE statement into your Query window:

 USE Northwind

This USE statement indicates that you want to use the Northwind database. Next, on a separate line, enter the following SELECT statement:

 SELECT CustomerID, CompanyName FROM Customers;

This SELECT statement indicates that you want to retrieve the CustomerID and CompanyName columns from the Customers table.

Note

SELECT and FROM are SQL keywords. Although SQL isn't case sensitive, I use uppercase when specifying SQL keywords and mixed case when specifying column and table names. You may terminate a SQL statement using a semicolon (;), although this isn't mandatory.

You can run the SQL statement entered in the Query window in five ways:

Selecting Execute from the Query menu
Clicking the Execute Query button (green triangle) on the toolbar
Pressing the F5 key on the keyboard
Pressing Ctrl+E on the keyboard
Pressing Alt+X on the keyboard

Once you run the SQL statement, your statement is sent to the database for execution. The database runs your statement and sends results back. These results are then displayed in the bottom of your Query window.

Saving and Loading a SQL Statement

You can save a SQL statement previously entered into Query Analyzer as a text file. Later, you can load and run the SQL statement saved in that file. You can save a SQL statement by

Selecting Save or Save As from the File menu
Clicking the Save Query/Result button (disk) on the toolbar
Pressing Ctrl+S on the keyboard

When you do any of these, the Query Analyzer opens the Save Query dialog box. Let's say you save the file as CustomerSelect.sql. Once you've saved the file, you can open it by

Selecting Open from the File menu
Clicking the Load SQL Script button (open folder) on the toolbar
Pressing Ctrl+Shift+P on the keyboard

When you do any of these, the Query Analyzer opens the Open Query File dialog box. Let's say you open CustomerSelect.sql. Once you've opened a query file, you can run it using one of the techniques described earlier.

Understanding Data Manipulation Language (DML) Statements

As mentioned earlier, DML statements enable you to retrieve, add, modify, and delete rows stored in database tables. There are four types of DML statements:

SELECT Retrieves rows from one or more tables.
INSERT Adds one or more new rows to a table.
UPDATE Modifies one or more rows in a table.
DELETE Removes one or more rows from a table.

You'll learn how to use these four statements in the following sections.

Retrieving Rows From a Single Table

You use the SELECT statement to retrieve rows from tables. The SELECT statement has many forms, and the simplest version allows you to specify a list of columns and the table name. For example, the following SELECT statement retrieves the CustomerID, CompanyName, ContactName, and Address columns from the Customers table:

 SELECT CustomerID, CompanyName, ContactName, Address FROM Customers;

The columns to retrieve are specified after the SELECT keyword, and the table is specified after the FROM keyword.

If you want to retrieve all columns from a table, specify the asterisk character (*) immediately after the SELECT keyword.

Tip

To avoid retrieving more information than you need, rather than use *, list only the columns you actually want.

For example, the following SELECT statement retrieves all the columns from the Customers table using *:

 SELECT * FROM Customers;

Figure 3.3 shows the results of this SELECT statement.

click to expand
Figure 3.3: Using a SELECT statement to retrieve rows from the Customers table

To retrieve rows from a table containing a space in its name, you place that table name in square brackets. For example, the following SELECT statement retrieves rows from the Order Details table:

 SELECT * FROM [Order Details];

Note

You can also use square brackets when you have a column with a name that contains a space.

Restricting Retrieved Rows

You use the WHERE clause to restrict the rows retrieved by a SELECT statement. For example, the following SELECT statement uses a WHERE clause to restrict the rows retrieved from the Customers table to those where the Country column is equal to 'UK':

 SELECT CustomerID, CompanyName, City FROM Customers WHERE Country = 'UK';

Figure 3.4 shows the results of this SELECT statement.

click to expand
Figure 3.4: Using a WHERE clause to restrict rows from the Customers table to those where Country is equal to 'UK'

The next SELECT statement uses a WHERE clause to restrict the row retrieved from the Products table to the one where ProductID is equal to 10:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID = 10;

The equal operator (=) is not the only operator you can use in a WHERE clause. Table 3.1 shows other mathematical operators you can use.

Table 3.1: SQL MATHEMATICAL OPERATORS
OPERATOR	DESCRIPTION
=	Equal
<> or !=	Not equal
<	Less than
>	Greater than
<=	Less than or equal
>=	Greater than or equal

The following SELECT statement uses the less-than-or-equal operator (<=) to retrieve the rows from the Products table where the ProductID column is less than or equal to 10:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID <= 10;

The next SELECT statement uses the not-equal operator (<>) to retrieve the rows from the Products table where the ProductID column is not equal to 10:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID <> 10;

Performing Pattern Matching

You use the LIKE operator in a WHERE clause to perform pattern matching. You specify one or more wildcard characters to use in your pattern matching string. Table 3.2 lists the wildcard characters.

Table 3.2: WILDCARD CHARACTERS
CHARACTERS	DESCRIPTION
_	Matches any one character. For example, J_y matches Joy and Jay.
%	Matches any number of characters. For example, %wind matches Northwind and Southwind; %fire% matches starfire, firestarter, and fireman.
[ ]	Matches any one character in the brackets. For example, [sm]ay matches say and may.
[^ ]	Matches any one character not in the brackets. For example, [^a] matches any character except a.
[ - ]	Matches a range of characters. For example, [a-c]bc matches abc, bbc, and cbc.
#	Matches any one number. For example, A# matches A1 through A9.

Let's take a look at some examples that use some of the wildcard characters shown in Table 3.2. The following SELECT statement uses the LIKE operator to retrieve products where the ProductName column is like 'Cha_':

 SELECT ProductID, ProductName FROM Products WHERE ProductName LIKE 'Cha_';

Figure 3.5 shows the results of this SELECT statement. LIKE 'Cha_' matches products with names that start with Cha and end with any one character.

click to expand
Figure 3.5: Products where ProductName is like 'Cha_'

The next SELECT statement uses the LIKE operator to retrieve products where the ProductName column is like 'Cha%':

 SELECT ProductID, ProductName FROM Products WHERE ProductName LIKE 'Cha%';

Figure 3.6 shows the results of this SELECT statement. LIKE 'Cha%' matches products with names that start with Cha and end with any number of characters.

click to expand
Figure 3.6: Products where ProductName is like 'Cha%'

The next SELECT statement uses the LIKE operator to retrieve products where the ProductName column is like '[ABC]%':

 SELECT ProductID, ProductName FROM Products WHERE ProductName LIKE '[ABC]%';

Figure 3.7 shows the results of this SELECT statement. LIKE '[ABC]%' matches products with a name that starts with any of the letters A, B, or C, and ends with any number of characters.

click to expand
Figure 3.7: Products where ProductName is like '[ABC]%'

The next SELECT statement uses the LIKE operator to retrieve products where the ProductName column is like '[^ABC]%':

 SELECT ProductID, ProductName FROM Products WHERE ProductName LIKE '[^ABC]%';

Figure 3.8 shows the results of this SELECT statement. LIKE '[^ABC]%' matches products with names that don't start with any of the letters A, B, or C, and end with any number of characters.

click to expand
Figure 3.8: Products where ProductName is like '[^ABC]%'

The next SELECT statement uses the LIKE operator to retrieve products where the ProductName column is like '[A-E]%':

 SELECT ProductID, ProductName FROM Products WHERE ProductName LIKE '[A-E]%';

Figure 3.9 shows the results of this SELECT statement. LIKE '[A-E]%' matches products with names that start with any of the letters A through E, and end with any number of characters.

click to expand
Figure 3.9: Products where ProductName is like '[A-E]%'

Specifying a List of Values

You use the IN operator in a WHERE clause to retrieve rows with columns that contain values in a specified list. For example, the following SELECT statement uses the IN operator to retrieve products with a ProductID of 1, 2, 5, 15, 20, 45, or 50:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID IN (1, 2, 5, 15, 20, 45, 50);

Here's another example that displays the OrderID column from the Orders table for the rows where the CustomerID column is in the list retrieved by a subquery; the subquery retrieves the CustomerID column from the Customers table where the CompanyName is like 'Fu%':

 SELECT OrderID FROM Orders WHERE CustomerID IN (   SELECT CustomerID   FROM Customers   WHERE CompanyName LIKE 'Fu%' );

The results of the subquery are used in the outer query.

Specifying a Range of Values

You use the BETWEEN operator in a WHERE clause to retrieve rows with columns that contain values in a specified range. For example, the following SELECT statement uses the BETWEEN operator to retrieve products with a ProductID between 1 and 12:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID BETWEEN 1 AND 12;

Here's another example that displays the OrderID column for the rows from the Orders table where the OrderDate is between '1996-07-04' and '1996-07-08':

 SELECT OrderID FROM Orders WHERE OrderDate BETWEEN '1996-07-04' AND '1996-07-08';

Reversing the Meaning of an Operator

You use the NOT keyword with an operator in a WHERE clause to reverse the meaning of that operator. For example, the following SELECT statement uses the NOT keyword to reverse the meaning of the BETWEEN operator:

 SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID NOT BETWEEN 1 AND 12;

Note

You can use the NOT keyword to reverse other operators, for example, NOT LIKE, NOT IN.

Retrieving Rows with Columns Set to Null

Earlier, I mentioned that columns can contain null values. A null value is different from a blank string or zero: A null value represents a value that hasn't been set, or is unknown. You can use the IS NULL operator in a WHERE clause to determine if a column contains a null value. For example, the following SELECT statement uses the IS NULL operator to retrieve customers where the Fax column contains a null value:

 SELECT CustomerID, CompanyName, Fax FROM Customers WHERE Fax IS NULL;

Figure 3.10 shows the results of this SELECT statement.

click to expand
Figure 3.10: Using the IS NULL operator to retrieve customers where Fax contains a null value

As you can see, null values are displayed as NULL in the Query Analyzer.

Specifying Multiple Conditions

You can use the logical operators shown in Table 3.3 to specify multiple conditions in a WHERE clause.

Table 3.3: LOGICAL OPERATORS
OPERATOR	DESCRIPTION
a AND b	Evaluates to true when a and b are both true
a OR b	Evaluates to true when either a or b are true
NOT a	Evaluates to true if a is false, and false if a is true

For example, the following SELECT statement uses the AND operator to retrieve products where the UnitsInStock column is less than 10 and the ReorderLevel column is less than or equal to 20:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products WHERE UnitsInStock < 10 AND ReorderLevel <= 20;

Figure 3.11 shows the results of this SELECT statement.

click to expand
Figure 3.11: Using the AND operator to retrieve products where UnitsInStock is less than 10 and ReorderLevel is less than or equal to 20

In the next example, the SELECT statement uses the OR operator to retrieve products where either the UnitsInStock column is less than 10 or the ReorderLevel column is less than or equal to 20:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products WHERE UnitsInStock < 10 OR ReorderLevel <= 20;

Figure 3.12 shows the results of this SELECT statement.

click to expand
Figure 3.12: Using the OR operator to retrieve products where either UnitsInStock is less than 10 or ReorderLevel is less than or equal to 20

The next SELECT statement uses the NOT operator to retrieve products where the UnitsInStock column is not less than 10:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products WHERE NOT (UnitsInStock < 10);

Sorting Rows

You can use the ORDER BY clause to sort rows retrieved from the database. You specify the column (or columns) to sort in the ORDER BY clause. By default, rows are sorted in ascending order. For example, the following SELECT statement orders the rows using the ProductName column:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products ORDER BY ProductName;

Figure 3.13 shows the results of this SELECT statement. As you can see, the rows are ordered in ascending order using the ProductName column.

click to expand
Figure 3.13: Using the ORDER BY clause to order products by ascending ProductName

You can explicitly state the order for a column using the ASC or DESC keyword. ASC orders the columns in ascending order (smallest item first), and DESC orders the columns in descending order (largest item first). For example, the following SELECT statement orders the products in descending order using the ProductName column:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products ORDER BY ProductName DESC;

You can specify multiple columns in an ORDER BY clause. For example, the following SELECT statement orders the rows using both the UnitsInStock and ReorderLevel columns:

 SELECT ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products ORDER BY UnitsInStock DESC, ReorderLevel ASC;

Figure 3.14 shows the results of this SELECT statement. As you can see, the rows are ordered by the UnitsInStock column first (in descending order), and then by the ReorderLevel column (in ascending order).

click to expand
Figure 3.14: Using the DESC and ASC keywords to order products by descending UnitsInStock and ascending ReorderLevel

Retrieving the Top N Rows

You use the TOP keyword to just retrieve the top N rows from a SELECT statement. For example, the following SELECT statement uses the TOP keyword to retrieve the top 10 rows from the Products table, ordered by the ProductID column:

 SELECT TOP 10 ProductID, ProductName, UnitsInStock, ReorderLevel FROM Products ORDER BY ProductID;

Note

I've also used the optional ORDER BY clause in this example SELECT statement to order the rows by the ProductID column.

Figure 3.15 shows the results of this SELECT statement.

click to expand
Figure 3.15: Using the TOP keyword to retrieve the top 10 products by ProductID

Eliminating Duplicate Rows

You use the DISTINCT keyword to eliminate duplicate rows retrieved by a SELECT statement. For example, the following SELECT statement uses the DISTINCT keyword to retrieve the distinct Country column values from the Customers table:

 SELECT DISTINCT Country FROM Customers;

Figure 3.16 shows the results of this SELECT statement.

Figure 3.16: Using the DISTINCT keyword to retrieve distinct Country column values

As you can see, the SELECT statement only displays Country column values that are unique: duplicate values are eliminated. If you didn't include the DISTINCT keyword, then all the Country column values would be displayed.

Combining Retrieved Rows From SELECT Statements

You use the UNION operator to combine retrieved rows from SELECT statements into one set of rows. For example, the following SELECT statement uses the UNION operator to combine the retrieved rows from two SELECT statements that retrieve rows from the Products table; the first retrieves rows where the ProductID is less than or equal to 5, and the second retrieves rows where the ProductName starts with Queso:

 (SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductID <= 5) UNION (SELECT ProductID, ProductName, QuantityPerUnit, UnitPrice FROM Products WHERE ProductName LIKE 'Queso%');

Figure 3.17 shows the results of this statement.

click to expand
Figure 3.17: Using the UNION operator to combine retrieved rows from two SELECT statements

Dividing Retrieved Rows into Blocks

You use the GROUP BY clause to divide retrieved rows into blocks. You can think of a block as a group of rows that have been condensed into one row. For example, let's say you grouped the SupplierID column of the rows from the Products table. You would get one row for every row that had the same SupplierID column value. The following SELECT statement uses the GROUP BY clause to divide the SupplierID column values into blocks:

 SELECT SupplierID FROM Products GROUP BY SupplierID;

This SELECT statement displays one row for each group of rows that have the same SupplierID column value. You can get the number of rows in each block using the COUNT() function. COUNT() is one of the functions that come built into SQL Server, and is known as an aggregate function because it can operate on more than one row at a time. You use COUNT(*) to get the number of rows, as shown in the following example that retrieves the SupplierID and number of rows for each group of SupplierID column values:

 SELECT SupplierID, COUNT(*) FROM Products GROUP BY SupplierID;

Figure 3.18 shows the results of this SELECT statement.

Figure 3.18: Using the GROUP BY clause to divide rows into blocks

You'll learn more about the various SQL Server functions in the next chapter.

Restricting Retrieved Groups of Rows

You use the HAVING clause to restrict the groups of rows retrieved by the GROUP BY clause. For example, the following SELECT statement uses the HAVING clause to restrict the group of rows returned to those that have more than 4 rows in each group:

 SELECT SupplierID, COUNT(*) FROM Products GROUP BY SupplierID HAVING COUNT(*) > 4;

Figure 3.19 shows the results of this SELECT statement.

Figure 3.19: Using the HAVING clause to restrict retrieved groups of rows

Specifying the Display Name for a Column and Aliasing a Table

You can use the AS clause to specify the name of a column when it is displayed in the output from a SELECT statement. You might want to do this when you need to display more friendly names or descriptive names for columns. For example, the following SELECT statement uses the AS clause to set the display name of the ProductName column to Product, and the UnitPrice column to Price for each unit:

 SELECT ProductName AS Product, UnitPrice AS 'Price for each unit' FROM products;

Figure 3.20 shows the results of this SELECT statement.

click to expand
Figure 3.20: Using the AS clause to specify the display name for columns

You can also use the AS clause to alias a table. You might want to do this if your table names are long. The following example uses the AS clause to alias the Customers and Orders tables as Cust and Ord respectively:

 SELECT Cust.CustomerID, CompanyName, Address, OrderID, ShipAddress FROM Customers AS Cust, Orders AS Ord WHERE Cust.CustomerID = Ord.CustomerID AND Cust.CustomerID = 'ALFKI';

Performing Computations Based on Column Values

You typically use calculated fields to perform computations based on column values. For example, you might want to use a calculated field to compute the effect of increasing the UnitPrice column of the Products table by 20 percent. The following SELECT statement shows this:

 SELECT UnitPrice * 1.20 FROM Products WHERE ProductID = 1;

This example returns 21.600000. The new unit price is calculated using UnitPrice * 1.20. This is an increase of 20 percent over the current unit price.

The next example concatenates the ContactName and ContactTitle columns from the Customers table for the row where the CustomerID equals ALFKI:

 SELECT ContactName + ', ' + ContactTitle FROM Customers WHERE CustomerID = 'ALFKI';

This example returns Maria Anders, Sales Representative.

Retrieving Rows From Multiple Tables

So far, you've seen SELECT statements that retrieve rows from only one table at a time. You'll often need to retrieve rows from multiple tables using the same SELECT statement.

For example, you might want to see all the orders placed by a customer. To do this, you must specify both the Customers and the Orders tables after the FROM keyword in the SELECT statement and use a table join in the WHERE clause. You must also specify the name of the table when referencing columns of the same name in both tables. The following SELECT statement shows this and retrieves the orders placed by the customer with a CustomerID of ALFKI:

 SELECT Customers.CustomerID, CompanyName, Address, OrderID, ShipAddress FROM Customers, Orders WHERE Customers.CustomerID = Orders.CustomerID AND Customers.CustomerID = 'ALFKI';

Notice that the Customers and Orders tables are specified after the FROM keyword, and because both tables contain a column named CustomerID, the table name is placed before each reference to the respective column in each table. The table join is done on the CustomerID column of each table (Customers.CustomerID = Orders.CustomerID).

Figure 3.21 shows the results of this SELECT statement.

click to expand
Figure 3.21: Using a multitable SELECT statement to retrieve orders placed by a specific customer

The previous SELECT statement used the SQL standard format for joining tables. With SQL Server, you can also use the JOIN keyword for joining tables. The advantage of the JOIN keyword is you can use it to perform outer joins, which you'll learn about shortly. Here's an example that rewrites the previous SELECT statement using the JOIN keyword:

 SELECT Customers.CustomerID, CompanyName, Address, OrderID, ShipAddress FROM Customers JOIN Orders ON Customers.CustomerID = Orders.CustomerID AND Customers.CustomerID = 'ALFKI';

This SELECT statement returns the same results as the previous example.

The disadvantage of the previous two SELECT statements is that they return rows only where the join columns both contain a value, that is, neither column contains a null. This can be a problem if you have rows that have a null value in either of the columns used in the join and you need to actually retrieve those rows. Outer joins solve this problem. There are three types of outer joins:

LEFT OUTER JOIN The LEFT OUTER JOIN (usually shortened to LEFT JOIN) returns all the rows from the table on the left of the join, including those with a column that contains a null value.
RIGHT OUTER JOIN The RIGHT OUTER JOIN (usually shortened to RIGHT JOIN) returns all the rows from the table on the right of the join, including those with a column that contains a null value.
FULL OUTER JOIN The FULL OUTER JOIN (usually shortened to FULL JOIN) returns all the rows from the tables on the left and right of the join, including those with a column that contains a null value.

Let's take a look at a couple of examples. First, perform the following INSERT to add a row to the Products table:

 INSERT INTO Products (ProductName, SupplierID) VALUES ('DVD Player', NULL);

Note

You'll learn the details of the INSERT statement later in this chapter.

You don't need to specify the ProductID column because SQL Server will automatically supply a value using an identity. This identity was established when the Products table was created, and the identity generates a series of values that start with 1 and are incremented by 1 each time it is used. For example, the ProductID column initially contains a series of values from 1 to 77, therefore the next INSERT statement that adds a row to the Products table will set the ProductID column to 78 for that row-the next identity value.

You'll notice that the SupplierID column in the INSERT statement is null. If you now perform the following SELECT statement, you won't see the new row because the SupplierID column of the new row is null and the JOIN won't return that row:

 SELECT ProductID FROM Products JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID;

To see the new row, you use LEFT JOIN in the SELECT statement to retrieve all rows from the table on the left of the join (in this case, the table on the left is the Products table):

 SELECT ProductID FROM Products LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID;

You can also use LEFT JOIN with IS NULL in the same SELECT statement to retrieve just the new row:

 SELECT ProductID FROM Products LEFT JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID WHERE Products.SupplierID IS NULL;

Retrieving Rows From a View

You use a view to retrieve a set of columns from one or more tables. You can think of a view as a more flexible way of examining the rows stored in the tables. For example, one of the views of the Northwind database retrieves an alphabetical list of products, and retrieves the product name and category name, among other columns. This information comes from both the Products and Categories tables. This view is named Alphabetical list of products and the SELECT statement that makes up this view is as follows:

 SELECT Products.*, Categories.CategoryName FROM Categories INNER JOIN Products ON  Categories.CategoryID = Products.CategoryID WHERE (((Products.Discontinued)=0));

You can retrieve all columns and rows from the underlying tables referenced by this view using the following SELECT statement:

 SELECT * FROM [Alphabetical list of products];

You can also retrieve individual columns from a view. For example, the following SELECT statement retrieves just the ProductName and CategoryName columns from the view:

 SELECT ProductName, CategoryName FROM [Alphabetical list of products];

Adding a New Row to a Table

You use the INSERT statement to add a new row to a table. When adding a new row, you specify the name of the table, the optional column names, and the values for those columns. For example, the following INSERT statement adds a new row to the Customers table:

 INSERT INTO Customers (   CustomerID, CompanyName, ContactName, ContactTitle, Address,   City, Region, PostalCode, Country, Phone, Fax ) VALUES (   'JPCOM', 'Jason Price Company', 'Jason Price', 'Owner', '1 Main Street',   'New York', NULL, '12345', 'USA', '(800)-555-1212', NULL );

The CustomerID column is the primary key of the Customers table, therefore the new row must contain a unique value for this column. You'll notice that the INSERT statement specifies a null value for the Region and Fax columns (this is specified using the NULL keyword).

You can use the Query Analyzer to enter INSERT statements. Figure 3.22 shows the previous INSERT, along with a SELECT statement that retrieves the new row.

click to expand
Figure 3.22: Using an INSERT statement to add a new row to the Customers table

Note

You must supply values for all columns that are defined as NOT NULL in a table. Also, the number of columns in the INSERT and VALUES lists must match, and the data type of each column in the INSERT and VALUES lists must also match.

When supplying values to all columns in a row, you may omit the column names and just supply the values for each column. For example:

 INSERT INTO Customers VALUES (   'CRCOM', 'Cynthia Red Company', 'Cynthia Red', 'Owner', '2 South Street',   'New York', NULL, '12345', 'USA', '(800)-555-1212', NULL );

Modifying Rows in a Table

You use the UPDATE statement to update rows in a table. When updating a row, you specify the name of the table, the columns to update, and the new values for the columns.

Warning

Typically, you should also use a WHERE clause to restrict the rows being updated. If you don't supply a WHERE clause, then all the rows in the specified table will be updated. In many cases, you'll specify the value for the primary key in your WHERE clause.

The following UPDATE statement modifies the Address column for the row in the Customers table with a CustomerID of JPCOM:

 UPDATE Customers SET Address = '3 North Street' WHERE CustomerID = 'JPCOM';

Figure 3.23 shows this UPDATE statement, along with a SELECT statement that retrieves the modified row.

click to expand
Figure 3.23: Using an UPDATE statement to modify the Address column of a row in the Customers table

You can use an UPDATE statement to modify multiple columns. For example, the following UPDATE statement modifies the Address and ContactTitle columns:

 UPDATE Customers SET Address = '5 Liberty Street', ContactTitle = 'CEO' WHERE CustomerID = 'JPCOM';

Removing Rows From a Table

You use the DELETE statement to remove rows from a table. When removing a row, you specify the name of the table and the rows to delete using a WHERE clause.

Warning

If you omit the WHERE clause in a DELETE statement, all rows from the table will be deleted. Make sure you provide a WHERE clause if you don't want to remove all the rows from the table. Typically, you'll specify the value for the primary key in your WHERE clause.

The following DELETE statement removes the row from the Customers table where the CustomerID is CRCOM:

 DELETE FROM Customers WHERE CustomerID = 'CRCOM';

Figure 3.24 shows this DELETE statement, along with a SELECT statement that demonstrates that the row has been removed.

click to expand
Figure 3.24: Using an UPDATE statement to remove a row from the Customers table

In the next section, you'll learn how the database software maintains the integrity of the information stored in the database.

Maintaining Database Integrity

The database software ensures that the information stored in the tables is consistent. In technical terms, it maintains the integrity of the information. Two examples of this are the following:

The primary key of a row always contains a unique value.
The foreign key of a row in the child table always references a value that exists in the parent table.

Let's take a look at what happens when you try to insert a row into a table with a primary key that already exists. The following INSERT statement attempts to add a row to the Customers table with a CustomerID of ALFKI (a row with this primary key already exists in the Customers table):

 INSERT INTO Customers (   CustomerID, CompanyName, ContactName, ContactTitle, Address,   City, Region, PostalCode, Country, Phone, Fax ) VALUES (   'ALFKI', 'Jason Price Company', 'Jason Price', 'Owner', '1 Main Street',   'New York', NULL, '12345', 'USA', '(800)-555-1212', NULL );

If you attempt to run this INSERT statement, you'll get the following error message from the database:

 Violation of PRIMARY KEY constraint 'PK_Customers'. Cannot insert duplicate key in object 'Customers'. The statement has been terminated.

This INSERT statement fails because an existing row in Customers table already contains the primary key value ALFKI. The message tells you that the primary key specified in the INSERT statement already exists in the Customers table. The constraint name PK_Customers is the name of the table constraint assigned to the primary key when the Customers table was originally created. At the end, the message indicates that the statement has been terminated, meaning that the INSERT statement has not been performed.

Let's take a look at what happens when you try to modify a primary key in a parent table with a value that is referenced in a foreign key in a child table. The following UPDATE statement attempts to modify the CustomerID from ALFKI to ALFKZ in the parent Customers table (this row is referenced by rows in the child Orders table):

 UPDATE Customers SET CustomerID = 'ALFKZ' WHERE CustomerID = 'ALFKI';

If you attempt to run this UPDATE statement, you'll get the following error message:

 UPDATE statement conflicted with COLUMN REFERENCE constraint 'FK_Orders_Customers'. The conflict occurred in database 'Northwind', table 'Orders', column 'CustomerID'. The statement has been terminated.

This UPDATE statement fails because the row containing the primary key value ALFKI is referenced by rows in the Orders table. The message tells you that the new value for the CustomerID column violates the foreign key constraint on the CustomerID column of the Orders table. This constraint is named FK_Orders_Customers.

Also, you can't remove a row from a parent table that is referenced by a row in a child table. For example, the following DELETE statement attempts to remove the row from the Customers table where the CustomerID column equals ALFKI (this row is referenced by rows in the Orders table):

 DELETE FROM Customers WHERE CustomerID = 'ALFKI';

If you attempt to run this DELETE statement, you'll get the same error message that was shown for the previous UPDATE statement. This DELETE statement fails because the Orders table contains rows that reference the row in the Customers table, and removing this row would make the database inconsistent because the rows in the Orders table wouldn't reference a valid row.

Grouping SQL Statements

By default, when you run an INSERT, UPDATE, or DELETE statement, SQL Server permanently records the results of the statement in the database. This might not always be your desired result. For example, in the case of a banking transaction, you might want to withdraw money from one account and deposit it into another. If you had two separate UPDATE statements that performed the withdrawal and deposit, then you would want to make the results of each UPDATE statement permanent only as one unit. If either UPDATE failed for some reason, then you would want to undo the results of both UPDATE statements.

Note

Permanently recording the results of SQL statements is known as a commit, or committing the SQL statements. Undoing the results of SQL statements is known as a rollback, or rolling back the SQL statements.

You can group SQL statements into a transaction. You can then commit or roll back the SQL statements in that transaction as one unit. For example, the two UPDATE statements in the previous banking example could be placed into a transaction, and then you could commit or roll back that transaction as one unit, depending on whether both of the UPDATE statements succeeded.

You start a transaction using the BEGIN TRANSACTION statement or the shorthand version, BEGIN TRANS. You then perform your SQL statements that make up the transaction. To commit the transaction, you perform a COMMIT TRANSACTION statement or one of the shorthand versions, COMMIT TRANS or COMMIT. To roll back the transaction, you perform a ROLLBACK TRANSACTION statement or one of the shorthand versions, ROLLBACK TRANS or ROLLBACK.

Note

By default, transactions are rolled back. You should always explicitly commit or roll back a transaction to indicate what you want to do.

Let's take a look at an example. The following transaction consists of two INSERT statements: the first adds a row to the Customers table, and the second adds a row to the Orders table. At the end, the transaction is committed using a COMMIT statement:

 BEGIN TRANSACTION; INSERT INTO Customers (   CustomerID, CompanyName ) VALUES (   'SOCOM', 'Steve Orange Company' ); INSERT INTO Orders (   CustomerID ) VALUES (   'SOCOM' ); COMMIT;

Figure 3.25 shows this transaction, along with two SELECT statements that show the two new rows.

click to expand
Figure 3.25: Using a transaction

The next transaction consists of similar INSERT statements, except this time the transaction is rolled back using a ROLLBACK statement.

 BEGIN TRANSACTION; INSERT INTO Customers (   CustomerID, CompanyName ) VALUES (   'SYCOM', 'Steve Yellow Company' ); INSERT INTO Orders (   CustomerID ) VALUES (   'SYCOM' ); ROLLBACK;

Because the transaction is rolled back, the two rows added by the INSERT statements are undone.

You should check for errors in a transaction before deciding to perform a COMMIT or ROLLBACK because errors do not always stop the next line from processing. To do this in SQL Server, you use the @@ERROR function. This function returns zero whenever a statement is executed and doesn't cause an error. If @@ERROR returns a nonzero value, you know an error occurred. If @@ERROR returns 0, you perform a COMMIT, otherwise you perform a ROLLBACK.

You can also assign a name to your transaction in the BEGIN TRANSACTION statement. This is useful as it shows which transaction you are working on.

The following example shows the naming of a transaction, along with the use of the @@ERROR function to determine whether to perform a COMMIT or ROLLBACK:

 BEGIN TRANSACTION MyTransaction; INSERT INTO Customers (   CustomerID, CompanyName ) VALUES (   'SYCOM', 'Steve Yellow Company' ); INSERT INTO Orders (   CustomerID ) VALUES (   'SYCOM' ); IF @@Error = 0   COMMIT TRANSACTION MyTransaction; ELSE   ROLLBACK TRANSACTION MyTransaction;

Notice that the name of the transaction is MyTransaction, and that this name is used in the COMMIT and ROLLBACK statements.

Note

You use the IF statement to conditionally execute a statement. You'll learn more about this in Chapter 4, "Introduction to Transact-SQL Programming."

Introducing Data Definition Language (DDL) Statements

As mentioned earlier, DDL statements allow you to create database structures such as tables and indexes. In this section, you'll learn how to create, alter, and drop a table, and create and drop an index.

Creating a Table

You create a table using the CREATE TABLE statement. For example, let's say you wanted to store the details for a number of persons in the database. Assume you want to store a person's first name, last name, and date of birth. Let's call this table Persons. You also want to uniquely identify each row in the Persons table using a numeric ID, which acts as the primary key for the table. The following CREATE TABLE statement creates the Persons table:

 CREATE TABLE Persons (   PersonID int CONSTRAINT PK_Persons PRIMARY KEY,   FirstName nvarchar(15) NOT NULL,   LastName nvarchar(15) NOT NULL,   DateOfBirth datetime );

You use the CONSTRAINT clause to restrict the values stored in a table or column. You'll notice that the CONSTRAINT clause is used to specify the primary key for the table using the keywords PRIMARY KEY. The primary key is the PersonID column, and this constraint is named PK_Persons. The ID column is an int, meaning that it stores integers. Every row in the Persons table must have a unique number for the PersonID column.

The FirstName and LastName columns are nvarchar columns that may store up to 15 characters. Both of these columns are defined using the NOT NULL constraint. NOT NULL indicates that you must supply a value to the column. The default is NULL, meaning that you don't have to supply a value to the column.

Note

Primary keys always require a value, and are therefore implicitly NOT NULL.

The DateOfBirth column is a datetime, meaning that it can store a date and time. This column doesn't have a NOT NULL constraint and therefore uses the default of NULL.

Altering a Table

You alter an existing table using the ALTER TABLE statement. You can add or drop a column, and add or drop a constraint using the ALTER TABLE statement. For example, the following ALTER TABLE statement adds a column named Address to the Persons table:

 ALTER TABLE Persons ADD Address nvarchar(50);

The Address column is an nvarchar that can store up to 50 characters.

The next example drops the Address column from the Persons table:

 ALTER TABLE Persons DROP COLUMN Address;

The next example adds a column named EmployerID to the Persons table, which records the company that a person works for:

 ALTER TABLE Persons ADD EmployerID nchar(5) CONSTRAINT FK_Persons_Customers REFERENCES Customers(CustomerID);

The EmployerID column is a foreign key to the CustomerID column of the Customers table. The constraint is named FK_Persons_Customers.

Dropping a Table

You drop a table from the database using the DROP TABLE statement. For example, the following statement drops the Persons table:

 DROP TABLE Persons;

Creating an Index

You add an index to a table using the CREATE INDEX statement. An index allows you to potentially find a row more rapidly when you use the column with the index in a WHERE clause. For example, the following CREATE INDEX statement adds an index to the LastName column of the Persons table:

 CREATE INDEX LastNameIndex ON Persons(LastName);

Tip

If you frequently use a column in a WHERE clause, you should consider adding an index to that column.

Generally, you should create an index on a column only when you find that you are retrieving a small number of rows from a table containing many rows. A good rule of thumb is that an index is useful when you expect any single query to retrieve 10 percent or less of the total rows in a table. This means that the candidate column for an index should be used to store a wide range of values. A good candidate for indexing would be a column containing a unique number for each record, while a poor candidate for indexing would be a column that contains only a small range of numeric codes, such as 1, 2, 3, or 4. This consideration applies to all database types, not just numbers.

Normally, the DBA is responsible for creating indexes, but as an application developer, you might know more about the application than the DBA and be able to recommend which columns are good candidates for indexing.

Dropping an Index

You drop an index from a table using the DROP INDEX statement. For example, the following DROP INDEX statement drops LastNameIndex from the Persons table:

 DROP INDEX Persons.LastNameIndex;