SQL Data Manipulation Language (DML) Statements | DB2 9 Fundamentals: Certification Study Guide

In Chapter 4, "Working with Databases and Database Objects," we looked at several of the database objects that are available with DB2 and we saw two ways in which each object can be created. After one or more table objects have been defined, the next step toward creating a functioning database is to populate those table objects with data. And once a table is populated, at some point, its data may need to be retrieved, modified, or removed. That's where the Data Manipulation Language (DML) statements come in. DML statements are used exclusively to store data in, manipulate data in, retrieve data from, and remove data from tables and updatable views. There are four Data Manipulation Language statements available: the INSERT statement, the UPDATE statement, the DELETE statement, and the SELECT statement. We'll take a close look at each of these statements in the following sections.

The INSERT Statement

When a table is first created, it is nothing more than a definition of how a set of data values are to be stored; there is no data associated with it. But after it has been created, a table can be populated in a variety of ways. It can be bulk-loaded using the DB2 Load utility, it can be bulk-loaded using the DB2 Import utility, or rows can be added to it, one at a time, by executing the INSERT SQL statement. Of the three, the INSERT statement is the method that is used the most. Moreover, the INSERT statement can work directly with the table to be populated, or it can work with an updatable view that references the table to be populated. The basic syntax for the INSERT statement is:

 INSERT INTO [TableName | ViewName] <([ColumnName], ...) > VALUES ([Value], ...)

 INSERT INTO [TableName | ViewName] <([ColumnName], ...) > [SELECTStatement]

where:

TableName	Identifies the name assigned to the table to which data is to be added.
ViewName	Identifies the name assigned to the updatable view to which data is to be added.
ColumnName	Identifies the name of one or more columns in the table or updatable view to which data values are to be assigned. Each name provided must identify an existing column in the table or updatable view specified.
Value	Identifies one or more data values that are to be added to the column(s) in the table or updatable view specified.
SELECTStatement	Identifies a SELECT SQL statement that, when executed, will produce the data values to be added to the column(s) in the table or updatable view specified (by retrieving data from other tables and/or views).

Thus, suppose you wanted to add a record to a base table named DEPARTMENT that has the following characteristics:

Open table as spreadsheet

Column Name	Data Type
DEPTNO	INTEGER
DEPTNAME	CHAR(20)
MGRID	INTEGER

You could do so by executing an INSERT statement that looks something like this:

 INSERT INTO department (deptno, deptname, mgrid)   VALUES (1, 'SALES', 1001)

It is important to note that the number of values provided in the VALUES clause must be equal to the number of column names provided in the column name list. Furthermore, the values provided will be assigned to the columns specified based upon the order in which they appear-in other words, the first value provided will be assigned to the first column identified in the column name list, the second value provided will be assigned to the second column identified, and so on. Finally, each value provided must be compatible with the data type of the column to which the value is to be assigned.

If values are provided with the VALUES clause for every column found in the table, the column name list can be omitted. In this case, the first value provided will be assigned to the first column found in the table, the second value provided will be assigned to the second column found, and so on. Thus, the row of data that was added to the DEPARTMENT table in the previous example could just as well have been added by executing the following INSERT statement:

 INSERT INTO department VALUES (1, 'SALES', 1001)

Along with literal values, two special tokens can be used to designate values that are to be assigned to base table columns. The first of these is the DEFAULT token, which is used to assign a system or user-supplied default value to an identity column or a column defined with the WITH DEFAULT constraint. The second is the NULL token, which is used to assign a NULL value to any column that was not defined with the NOT NULL constraint. (Identity columns, the WITH DEFAULT constraint, and the NOT NULL constraint are covered in detail in Chapter 6, "Working with DB2 Tables, Views, and Indexes.") Thus, you could add a record that contains a NULL value for the MGRID column to the DEPARTMENT table we looked at earlier by executing an INSERT statement that looks something like this:

 INSERT INTO department VALUES (001, 'SALES', NULL)

By using a different form of the INSERT SQL statement, the results of a query can also be used to provide values for one or more columns in a base table. With this form of the INSERT statement, a SELECT statement (known as a subselect) is provided in place of the VALUES clause (we'll look at the SELECT statement shortly), and the results of the subselect are assigned to the appropriate columns. This form of the INSERT statement creates a type of "cut and paste" action in which values are retrieved from one base table or view and copied into another. As you might imagine, the number of values returned by the subselect must match the number of columns provided in the column name list (or the number of columns found in the table if no column name list is provided), and the order of assignment is the same as that used when literal values are provided in a VALUES clause. Thus, you could add a record to the DEPARTMENT table we looked at earlier, using the results of a query, by executing an INSERT statement that looks something like this:

 INSERT INTO department (deptno, deptname) SELECT deptno, deptname FROM sales_depts

You may have noticed that the INSERT statement used in the previous example did not provide values for every column found in the DEPARTMENT table. Just as there are times you may want to insert complete records into a table, there may be times when you only want to insert partial records into a table. Such operations can be performed by listing just the columns you have data values for in the column names list and providing corresponding values using either the VALUES clause or a subselect. However, in order for such an INSERT statement to execute correctly, all columns in the table the record is being inserted into that do not appear in the column name list provided must either accept NULL values or have a default value associated with them. Otherwise the INSERT statement will fail and an error will be generated.

The UPDATE statement

Data residing in a database is rarely static. Over time, the need to change, or even remove, one or more values stored in a table can, and will, arise. In such situations, specific data values can be altered by executing the UPDATE SQL statement. The basic syntax for this statement is:

 UPDATE [TableName | ViewName] SET [[ColumnName]=[Value] | NULL | DEFAULT, ... ] <WHERE [Condition]>

 UPDATE [TableName | ViewName] SET ([ColumnName],...)=([Value] | NULL | DEFAULT, ...) <WHERE [Condition]>

 UPDATE [TableName | ViewName] SET ([ColumnName],...)=([SELECTStatement]) <WHERE [Condition]>

where:

TableName	Identifies the name assigned to the table that contains the data to be modified.
ViewName	Identifies the name assigned to the updatable view that references the table that contains the data to be modified.
ColumnName	Identifies the name of one or more columns that contain data values to be modified. Each name provided must identify an existing column in the table or updatable view specified.
Value	Identifies one or more data values that are to be used to replace existing values found in the column(s) specified.
SELECTStatement	Identifies a SELECT SQL statement that, when executed, will produce the data values to be used to replace existing values found in the column(s) specified (by retrieving data from other tables and/or views).
Condition	Identifies the search criterion that is to be used to locate one or more specific rows whose data values are to be modified. (This condition is coded like the WHERE clause that can be used with a SELECT SQL statement; we will look at the WHERE clause and its predicates later.) If no condition is specified, the update operation will be performed on every row found in the table or updatable view specified.

Tip

It is very important that you provide a proper WHERE clause whenever the UPDATE statement is used. Failure to do so will cause an update operation to be performed on every row found in the table or updatable view specified.

Therefore, suppose you have a base table named EMPLOYEES that has the following characteristics:

Open table as spreadsheet

Column Name	Data Type
EMPNO	INTEGER
FNAME	CHAR(20)
LNAME	CHAR(30)
TITLE	CHAR(10)
DEPARTMENT	CHAR(20)
SALARY	DECIMAL(6,2)

If you wanted to modify the records stored in EMPLOYEES such that the salary of every employee who has the title of DBA is increased by 10%, you could do so by executing an UPDATE statement that looks something like this:

 UPDATE employees SET salary = salary * 1.10 WHERE title = 'DBA'

The UPDATE statement can also be used to remove values from nullable columns. This is done by changing the column's current value to NULL. Thus, the value assigned to the DEPARTMENT column of the EMPLOYEES table shown in the previous example could be removed by executing an UPDATE statement that looks like this:

 UPDATE employees SET salary = NULL

Like the INSERT statement, the UPDATE statement can either work directly with the table that contains the values to be modified, or work with an updatable view that references the table containing the values to be modified. Similarly, the results of a query, or subselect, can be used to provide values for one or more columns identified in the column name list provided. As with the INSERT statement, this form of the UPDATE statement creates a type of "cut and paste" action in which values retrieved from one base table or view are used to modify values stored in another. And also like the INSERT statement, the number of values returned by the subselect must match the number of columns specified in the column name list provided. Thus, you could change the value assigned to the DEPT column of each record found in the EMPLOYEES table we looked at earlier, using the results of a query, by executing an UPDATE statement that looks something like this:

 UPDATE employees SET (dept) =(SELECT deptname FROM department WHERE deptno = 1)

It is important to note that update operations can be conducted in one of two ways: by performing a searched update operation or by performing a positioned update operation. So far, all of the examples we have looked at have been searched update operations. To perform a positioned update, a cursor must first be created, opened, and positioned on the row that is to be updated. Then the UPDATE statement that is to be used to modify one or more data values must contain a WHERE CURRENT OF [CursorName] clause (CursorName identifies the cursor being used-we'll look at cursors shortly). Because of their added complexity, positioned update operations are typically performed by embedded SQL applications.

The DELETE Statement

Although the UPDATE statement can be used to delete individual values from a base table or updatable view (by setting those values to NULL), it cannot be used to remove entire rows. When one or more complete rows of data need to be removed from a base table, the DELETE SQL statement must be used instead. As with the INSERT statement and the UPDATE statement, the DELETE statement can either work directly with the table that rows are to be removed from or work with an updatable view that references the table that rows are to be removed from. The basic syntax for the DELETE statement is:

 DELETE FROM [TableName | ViewName] <WHERE [Condition]>

where:

TableName	Identifies the name assigned to the table rows of data are to be removed from.
ViewName	Identifies the name assigned to the updatable view rows of data are to be removed from.
Condition	Identifies the search criterion to be used to locate one or more specific rows that are to be deleted. (This condition is coded as in the WHERE clause used with a SELECT SQL statement; we will look at the WHERE clause and its predicates later.) If no condition is specified, the delete operation will be performed on every row found in the table or updatable view specified.

Tip

Because omitting the WHERE clause in a DELETE SQL statement causes the delete operation to be applied to all rows in the table or view specified, it is important to always provide a WHERE clause with a DELETE statement unless you explicitly want to erase all data stored in a table.

Thus, suppose you wanted to remove every record for company XYZ from a base table named SALES that has the following characteristics:

Open table as spreadsheet

Column Name	Data Type
PO_NUMBER	CHAR(10)
COMPANY	CHAR(20)
PURCHASEDATE	DATE
SALESPERSON	INTEGER

You could do so by executing a DELETE statement that looks something like this:

 DELETE FROM sales WHERE company = 'XYZ'

On the other hand, if you wanted to remove every record for which no sales person was assigned from a base table named SALES, you could do so by executing a DELETE statement that looks something like this:

 DELETE FROM sales WHERE salesperson IS NULL

Like update operations, delete operations can be conducted in one of two ways: as searched delete operations or as positioned delete operations. To perform a positioned delete, a cursor must first be created, opened, and positioned on the row to be deleted. Then, the DELETE statement used to remove the row must contain a WHERE CURRENT OF [CursorName] clause (CursorName identifies the cursor being used). Because of their added complexity, positioned delete operations are typically performed by embedded SQL applications.

The SELECT Statement

Sooner or later, almost all database users and/or applications have the need to retrieve specific pieces of information (data) from the database they are interacting with. The operation used to retrieve data from a database is called a query (because it searches the database to find the answer to some question), and the results returned by a query are typically expressed in one of two forms: either as a single row of data values or as a set of rows of data values, otherwise known as a result data set (or result set). If no data values that correspond to the query specification provided can be found in the database, an empty result data set will be returned.

All queries begin with the SELECT SQL statement, which is an extremely powerful statement that is used to construct a wide variety of queries containing an infinite number of variations (using a finite set of rules). And because the SELECT statement is recursive, a single SELECT statement can derive its output from a successive number of nested SELECT statements (which are known as subqueries). (Earlier, we saw how SELECT statements can be used to provide input to INSERT and UPDATE statements; SELECT statements can be used to provide input to other SELECT statements in a similar manner.)

In its simplest form, the syntax for the SELECT statement is:

 SELECT * FROM [ [TableName]|[ViewName]]

where:

TableName	Identifies the name assigned to the table from which data is to be retrieved.
ViewName	Identifies the name assigned to the view from which data is to be retrieved.

Consequently, if you wanted to retrieve all values stored in a base table named DEPARTMENT, you could do so by executing a SELECT statement that looks something like this:

 SELECT * FROM department

If you executed this query against the SAMPLE database provided with DB2, you would get a result data set that looks something like this:

 DEPTNO   DEPTNAME                 MGRNO  ADMRDEPT LOCATION ------   ----------------------   ------ -------- -------- A00      COMPUTER SERVICE DIV.    000010 A00      - B01      PLANNING                 000020 A00      - C01      INFORMATION CENTER       000030 A00      - D01      DEVELOPMENT CENTER       -      A00      - D11      MANUFACTURING SYSTEMS    000060 D01      - D21      ADMINISTRATION SYSTEMS   000070 D01      - E01      SUPPORT SERVICES         000050 A00      - E11      OPERATIONS               000090 E01      - E21      SOFTWARE SUPPORT         000100 E01      - F22      BRANCH OFFICE F2         -      E01      - G22      BRANCH OFFICE G2         -      E01      - H22      BRANCH OFFICE H2         -      E01      - I22      BRANCH OFFICE I2         -      E01      - J22      BRANCH OFFICE J2         -      E01      -  14 record(s) selected.