To begin to use SQL, you must understand how statements are written. SQL syntax falls into four main categories. Each category is introduced in the following list and then explained in further detail in the sections that follow: -
- Identifiers
-
Describe a user - or system-supplied name for database objects, such as a database, a table, a constraint on a table, the columns in a table, a view, etc. -
- Literals
-
Describe a user- or system-supplied string or value that is not otherwise an identifier or a keyword. Literals may be strings like "hello", numbers like 1234, dates like "Jan 01, 2002", or Boolean values like TRUE. -
- Operators
-
Are symbols specifying an action performed on one or more expressions, most often in DELETE , INSERT , SELECT , or UPDATE statements. Operators are also used frequently in the creation of database objects. -
- Reserved words and keywords
-
Have special meaning to the database SQL parser, such as SELECT , GRANT , DELETE , or CREATE . Reserved words are words that cannot be used as identifiers within the database platform, usually commands or SQL statements. Keywords are words that may become reserved some time in the future. You can circumvent the restriction on using reserved words and keywords as identifiers through quoted identifiers , described in a moment. This is not recommended! Elsewhere in the book (besides here), we use the term keyword to describe both concepts. 2.2.1 Identifiers Keep in mind that RDBMSs are built upon set theory. In ANSI terms, clusters contain sets of catalogs, catalogs contain sets of schemas, schemas contain sets of objects, and so on. Most database platforms use corollary terms: instances contain one or more databases; databases contain one or more schemas; schemas contain one or more tables, views, or stored procedures, and the privileges associated with each object. At each level of this structure, items require a unique name (that is, an identifier) so that they can be referenced by programs and system processes.This means that each object (whether a database, table, view, column, index, key, trigger, stored procedure, or constraint) in a RDBMS must be identified. When issuing the command that creates a database object, you must specify an identifier (i.e., a name) for that new object. There are two important categories of rules that experienced programmers keep in mind when choosing an identifier for a given item: -
- Naming conventions
-
Include logical rules of thumb or naming conventions that ultimately create better database structures and data tracking. These are not so much required by SQL as they are the distilled experience of practiced programmers. -
- Identifier rules
-
Are those set by the SQL standard and implemented by the platforms. Such rules include characteristics like how long a name may be. These identifier conventions are covered for each vendor later in this chapter. 2.2.1.1 Naming conventions Naming conventions establish a standard baseline for choosing object identifiers. In this section, we show a list of naming conventions (rules for picking your identifiers) that are based on long years of experience. The SQL standard has no comment on naming conventions outside of the uniqueness of an identifier, its length, and the characters that are valid within the identifier: -
- Select a name that is meaningful, relevant, and descriptive
-
Do not name a table XP03 ; instead, name it Expenses_2005 , showing that it stores expenses for the year 2005. Remember that other people will likely be using the table or database too, perhaps long after you have gone, and the names should make sense at a glance. Each database vendor has limits on object name size , but names generally may be long enough to make sense to anyone reading them. -
- Choose and apply the same case throughout
-
Use either all uppercase or all lowercase for all objects throughout the database. Some database servers are case-sensitive, so using mixed-case identifiers might cause problems later. -
- Use abbreviations consistently
-
Once an abbreviation has been chosen , it should be used consistently throughout the database. For example, if EMP is used as an abbreviation for EMPLOYEE, then EMP should be used throughout the database. Do not use EMP in some places and EMPLOYEE in others. -
- Use complete, descriptive, meaningful names with underscores for reading clarity
-
A column name UPPERCASEWITHUNDERSCORES is not as easy to read as UPPERCASE_WITH_UNDERSCORES . -
- Do not put company or product names in database object names
-
Companies get acquired and products change names. These elements are too transitory to be included in database object names. -
- Do not use overly obvious prefixes or suffixes
-
For example, don't use "DB_" as a prefix for a database, and don't prefix every view with "V_". Simple queries to the system table of the database can tell the DBA or database programmer what type of object an identifier represents. -
- Do not fill up all available space for the object name
-
If the database platform allows a 32-character table name, try to leave at least a few free characters at the end. Some database platforms sometimes append prefixes or suffixes to table names when manipulating temporary copies of the tables. -
- Do not use quoted identifiers
-
Quoted identifiers are object names stored within double-quotes. (The ANSI standard calls these delimited identifiers .) Encapsulating an identifier within double-quotes allows you to create names that will be difficult to use and that may cause you problems later. Quoted identifiers are also case-sensitive. For example, you could embed spaces, special characters, mixed case characters, or even escape sequences within a quoted identifier, but because some third-party tools (and even vendor-supplied tools) cannot handle special characters in names, you should not make extensive use of quoted identifiers. Some platforms allow other delimiting symbols than double-quotes. For example, SQL Server uses brackets [ ] to designate quoted identifiers. There are several benefits to following these naming conventions. First, your SQL code becomes, in a sense, self-documenting because the chosen names are meaningful and understandable to other users. Second, your SQL code and database objects are easier to maintain ” especially for other users who come after you ”because your objects are consistently named. Finally, maintaining consistency increases database functionality. If the database ever has to be transferred or migrated to another RDMS, consistent and descriptive naming saves both time and energy. Giving a few minutes of thought to naming SQL objects in the beginning can prevent problems later on. 2.2.1.2 Identifier rules Identifier rules are rules for identifying objects, enforced by the database platforms, within the database. These rules apply to normal identifiers, not quoted identifiers. Rules specified by the SQL2003 standard generally differ somewhat from those of a specific database vendor. Table 2-1 contrasts the SQL2003 rules with those of the five RDMS platforms covered in this book. Table 2-1. Platform-specific rules for regular object identifiers (excludes quoted identifiers) Characteristic | Platform | Specification | Identifier size | SQL2003 | 128 characters | | DB2 | 128 characters, depending on the object | | MySQL | 64 characters | | Oracle | 30 bytes (number of characters depends on the character set); database names are limited to 8 bytes | | PostgreSQL | 31 characters (NAMEDATALEN property minus 1) | | SQL Server | 128 characters (temp tables are limited to 116 characters) | Identifier may contain | SQL2003 | Any number, character, or underscore | | DB2 | Any number, uppercase character, digit, or the underscore character | | MySQL | Any number, character, or symbol | | Oracle | Any number, character, and the underscore (_), pound (#), and dollar ($) symbols | | PostgreSQL | Any number, character, or the underscore (_) symbol | | SQL Server | Any number, character, and the underscore (_), at sign (@), pound (#), and dollar ($) symbols | Identifier must begin with | SQL2003 | A letter | | DB2 | A letter | | MySQL | A letter or number (but cannot be composed entirely of numbers) | | Oracle | A letter | | PostgreSQL | A letter or underscore (_) | | SQL Server | A letter, underscore (_), at sign (@), or pound (#) | Identifier cannot contain | SQL2003 | Spaces or special characters | | DB2 | Spaces or special characters | | MySQL | Period (.), slash (/), or ASCII(0) and ASCII(255). Quote (') and double-quote (") are only allowed in quoted identifiers. | | Oracle | Spaces, double-quotes ("), or special characters | | PostgreSQL | Double-quote (") | | SQL Server | Spaces or special characters | Allows quoted identifiers | SQL2003 | Yes | | DB2 | Yes | | MySQL | Yes | | Oracle | Yes | | PostgreSQL | Yes | | SQL Server | Yes | Quoted identifier symbol | SQL2003 | Double-quote ( " ) | | DB2 | Double-quote (") | | MySQL | Quote ( ' ) or double-quote (" ) in ANSI compatibility mode | | Oracle | Double-quote (") | | PostgreSQL | Double-quote (") | | SQL Server | Double-quote (") or brackets ( [ ] ); brackets are preferred | Identifier may be reserved | SQL2003 | No, unless as a quoted identifier | | DB2 | Yes | | MySQL | No, unless as a quoted identifier | | Oracle | No, unless as a quoted identifier | | PostgreSQL | No, unless as a quoted identifier | | SQL Server | No, unless as a quoted identifier | Schema addressing | SQL2003 | Catalog.schema.object | | DB2 | Schema.object | | MySQL | Database.object | | Oracle | Schema.object | | PostgreSQL | Database.schema.object | | SQL Server | Server.database.schema.object | Identifier must be unique | SQL2003 | Yes | | DB2 | Yes | | MySQL | Yes | | Oracle | Yes | | PostgreSQL | Yes | | SQL Server | Yes | Other rules | SQL2003 | None | | DB2 | None | | MySQL | May not contain numbers only | | Oracle | Database links are limited to 128 bytes and may not be quoted identifiers | | PostgreSQL | None | | SQL Server | None | Identifiers must be unique within their scope. Thus, in our earlier discussion of the hierarchy of database objects, database names must be unique on a given instance of a database server, while the names of tables, views, functions, triggers, and stored procedures must be unique within a given schema. On the other hand, you can have a table and a stored procedure with the same name since they are different object types. The names of columns, keys, and indexes must be unique on a single table or view, and so forth. Check with the database platform documentation for more information ” some platforms require unique identifiers where others may not. For example, DB2 requires all index identifiers to be unique throughout the database, while SQL Server requires that the index identifier be unique only for the table it depends on. Remember, quoted identifiers (object names encapsulated within a special delimiter , usually double quotes) may be used to break some of the identifier rules specified earlier. Specifically, quoted identifiers may be used to bestow a reserved word as a name, or to allow normally unusable characters and symbols within a name. For example, you normally can't use the percent sign ( % ) in a table name. However, you can, if you must, use the percent sign in a table name so long as you always enclose that table name within double quotes. To name a table expense%%ratios , you would specify the name in quotes as " expense%%ratios ". Again, remember that in SQL2003, such names are sometimes known as delimited identifiers. | Once you have created an object name as a quoted identifier, we recommend that you always reference it using its special delimiter. | | 2.2.2 Literals SQL evaluates literal values as any explicit numeric, character string, temporal value (like a date or time), or Boolean value that is not an identifier or a keyword. SQL databases allow a variety of literal values in a SQL program. Literal values are allowed for most of the numeric, character, Boolean, and date datatypes. For example, SQL Server numeric datatypes include (among others) INTEGER , REAL , and MONEY . Thus, numeric literals can look like: 30 -117 +883.3338 -6.66 000 2E5 7E-3 As the example illustrates, SQL Server allows signed or unsigned numerals, in scientific or normal notation. And since SQL Server has a money datatype, even a dollar sign can be included. SQL Server does not allow other symbols in numeric literals (besides 0 1 2 3 4 5 6 7 8 9 + - $ . E e), so do not include commas (or periods in Europe). Most databases interpret a comma in a numeric literal as a list item separator . Thus, the literal value 3,000 would be interpreted by the database as 3 and, separately, 000. Boolean, character string, and date literals look like: TRUE 'Hello world!' 'OCT-28-1966 22:14:30:00' Character string literals should always be enclosed by single quotation marks (' '), the standard delimiter for all character string literals. Character string literals are not restricted just to the alphabet. In fact, any character in the character set can be represented as a string literal. All of the following are string literals: '1998' '70,000 + 14000' 'There once was a man from Nantucket,' 'Oct 28, 1966' All of these examples are, in fact, compatible with the CHARACTER datatype. Remember not to confuse the string literal `1998' with the numeric literal 1998. Once string literals are associated with CHARACTER datatypes, it is poor practice to use them in arithmetic operations without explicitly converting them to a numeric datatype. Some database products will perform automatic conversion of string literals containing numbers when compared against any DATE or NUMBER datatype values. By doubling the delimiter, you can effectively represent a single quotation mark in a literal string, if necessary. That is, use two quotation marks each time a single quotation mark is part of the value. This example taken from SQL Server illustrates the idea: SELECT 'So he said ''Who''s Le Petomaine?''' This gives the result: ----------------- So he said 'Who's Le Petomaine?' 2.2.3 Operators An operator is a symbol specifying an action that is performed on one or more expressions. Operators are used most often in DELETE , INSERT , SELECT , or UPDATE statements, but also are used frequently in the creation of database objects, such as stored procedures, functions, triggers, and views. Operators typically fall into these categories: -
- Arithmetic operators
-
Supported by all databases -
- Assignment operators
-
Supported by all databases -
- Bitwise operators
-
Supported by Microsoft SQL Server -
- Comparison operators
-
Supported by all databases -
- Logical operators
-
Supported by DB2, Oracle, SQL Server, and PostgreSQL -
- Unary operators
-
Supported by DB2, Oracle, and SQL Server 2.2.3.1 Arithmetic operators Arithmetic operators perform mathematical operations on two expressions of any datatypes in the numeric datatype category. See Table 2-2 for a listing of the arithmetic operators. Table 2-2. Arithmetic operators Arithmetic operator | Meaning | + | Addition | - | Subtraction | * | Multiplication | / | Division | % | Modula (SQL Server only); returns the remainder of a division operation as an integer value | | In DB2, Oracle, and SQL Server, the + and - operators also can be used to perform arithmetic operations on date values. | | 2.2.3.2 Assignment operators Except in Oracle, which uses := , the assignment operator ( = ) assigns a value to a variable or the alias of a column heading. In SQL Server, the keyword AS may serve as an operator for assigning table- or column-heading aliases. 2.2.3.3 Bitwise operators Microsoft SQL Server provides bitwise operators as a shortcut to perform bit manipulations between two-integer expressions (see Table 2-3). Valid datatypes that are accessible to bitwise operators include binary , bit , int , smallint , tinyint , and varbinary . Table 2-3. Bitwise operators Bitwise operator | Meaning | & | Bitwise AND (two operands) | | Bitwise OR (two operands) | ^ | Bitwise exclusive OR (two operands) | 2.2.3.4 Comparison operators Comparison operators test whether two expressions are equal or unequal . The result of a comparison operation is a Boolean value: TRUE , FALSE , or UNKNOWN . Also, note that the ANSI standard behavior for a comparison operation where one or more of the expressions are NULL is NULL. For example, the expression 23 + NULL returns NULL, as does the expression Feb 23, 2002 + NULL . See Table 2-4 for a list of the comparison operators. Table 2-4. Comparison operators Comparison operator | Meaning | = | Equal to | > | Greater than | < | Less than | >= | Greater than or equal to | <= | Less than or equal to | <> | Not equal to | != | Not equal to (not ANSI standard) | !< | Not less than (not ANSI standard) | !> | Not greater than (not ANSI standard) | Boolean comparison operators are used most frequently in a WHERE clause to filter the rows that qualify for the search conditions. The following Microsoft SQL Server example uses the greater than or equal to comparison operation: SELECT * FROM Products WHERE ProductID >= @MyProduct 2.2.3.5 Logical operators Logical operators are commonly used in a WHERE clause to test for the truth of some condition. Logical operators return a Boolean value of either TRUE or FALSE . Logical operators also are discussed in SELECT Statement. Not all database systems support all operators. See Table 2-5 for a list of logical operators. Table 2-5. Logical operators Logical operator | Meaning | ALL | TRUE if all of a set of comparisons are TRUE | AND | TRUE if both Boolean expressions are TRUE | ANY | TRUE if any one of a set of comparisons is TRUE | BETWEEN | TRUE if the operand is within a range | EXISTS | TRUE if a subquery contains any rows | IN | TRUE if the operand is equal to one of a list of expressions or one or more rows returned by a subquery | LIKE | TRUE if the operand matches a pattern | NOT | Reverses the value of any other Boolean operator | OR | TRUE if either Boolean expression is TRUE | SOME | TRUE if some of a set of comparisons is TRUE | 2.2.3.6 Unary operators Unary operators perform an operation on only one expression of any of the datatypes of the numeric datatype category. Unary operators may be used on integer datatypes, though positive and negative may be used on any numeric datatype (see Table 2-6). Table 2-6. Unary operators Unary operator | Meaning | + | Numeric value is positive | - | Numeric value is negative | ~ | A bitwise NOT, returns the complement of the number (not in Oracle or DB2) | 2.2.3.7 Operator precedence Sometimes operator expressions become rather complex. When an expression has multiple operators, operator precedence determines the sequence in which the operations are performed. The order of execution can significantly affect the resulting value. Operators have the precedence levels listed below. An operator on a higher level is evaluated before an operator on a lower level. The following listing denotes operators from highest to lowest precedence: -
( ) (parenthetical expressions) -
+ , - , ~ (unary operators) -
* , / , % (mathematical operators) -
+ , - (arithmetic operators) -
= , > , < , >= , <= , <> , != , !> , !< (comparison operators) -
^ (Bitwise Exclusive OR), & (Bitwise AND), (Bitwise OR) -
NOT -
AND -
ALL, ANY, BETWEEN, IN, LIKE, OR, SOME -
= (variable assignment) Operators are evaluated from left to right when they are of equal precedence. However, parentheses are used to override the default precedence of the operators in an expression. Expressions within a parentheses are evaluated first, while operations outside the parentheses are evaluated next . For example, the following expressions in an Oracle query return very different results: SELECT 2 * 4 + 5 FROM dual -- Evaluates to 8 + 5 which yields an expression result of 13. SELECT 2 * (4 + 5) FROM dual -- Evaluates to 2 * 9 which yields an expression result of 18. In expressions with nested parentheses, the most deeply nested expression is evaluated first. This next example contains nested parentheses, with the expression 5 - 3 in the most deeply nested set of parentheses. This expression yields a value of 2. Then the addition operator (+) adds this result to 4, which yields a value of 6. Finally, the 6 is multiplied by 2 to yield an expression result of 12: SELECT 2 * (4 + (5 - 3) ) FROM dual -- Evaluates to 2 * (4 + 2) which further evaluates to 2 * 6, --and yields an expression result of 12. RETURN 2.2.3.8 System delimiters and operators String delimiters mark the boundaries of a string of alphanumeric characters. System delimiters are those symbols within the character set that have special significance to your database server. Delimiters are symbols that are used to judge the order or hierarchy of processes and list items. Operators are those delimiters used to judge values in comparison operations, including symbols commonly used for arithmetic or mathematical operations. Table 2-7 lists the system delimiters and operators allowed by SQL. Table 2-7. SQL delimiters and operators Symbol | Usage | Example | + | Addition operator; in SQL Server, also serves as a concatenation operator | On all database platforms: SELECT MAX(emp_id) + 1 FROM employee | - | Subtraction operator; also a range indicator in CHECK constraints | As a subtraction operator: SELECT MIN(emp_id) - 1 FROM employee As a range operator, in a CHECK constraint: ALTER TABLE authors ADD CONSTRAINT authors_zip_num CHECK (zip LIKE '%[0-9]%') | * | Multiplication operator | SELECT salary * 0.05 AS 'bonus' FROM employee; | / | Division operator | SELECT salary / 12 AS 'monthly' FROM employee; | = | Equality operator | SELECT * FROM employee WHERE lname = 'Fudd' | <> | Inequality operators (!= is a nonstandard equivalent on several platforms) | On all platforms: SELECT * FROM employee WHERE lname <> 'Fudd' | < <= | Less-than operator Less-than or equal-to operator | SELECT lname, emp_id, (salary * 0.05) AS bonus FROM employee WHERE (salary * 0.05) <= 10000 AND exempt_status < 3 | > >= | Greater-than operator Greater-than or equal-to operator | SELECT lname, emp_id, (salary * 0.025) AS bonus FROM employee WHERE (salary * 0.025) > 10000 AND exempt_status >= 4 | ( ) | Expression, function calls, order of operations, and subquery delimiter | Expression: SELECT (salary / 12) AS monthly FROM employee WHERE exempt_status >= 4 Function call: SELECT SUM(travel_expenses) FROM "expense%%ratios" | | | Order of operations: SELECT (salary / 12) AS monthly, ((salary / 12) / 2) AS biweekly FROM employee WHERE exempt_status >= 4 | | | Subquery: SELECT * FROM stores WHERE stor_id IN (SELECT stor_id FROM sales WHERE ord_date > '01-JAN-2004') | % | Wildcard attribute indicator (refer toLIKE Operator.) | SELECT * FROM employee WHERE lname LIKE 'Fud%' | , | List item separator | SELECT lname, fname, ssn, hire_date FROM employee WHERE lname = 'Fudd' | . | Identifier qualifier separator | SELECT * FROM scott.employee WHERE lname LIKE 'Fud%' | ' | Character string indicators | SELECT * FROM employee WHERE lname LIKE 'FUD%' OR fname = 'ELMER' | " | Quoted identifier indicators | SELECT expense_date, SUM(travel_expense) FROM "expense%%ratios" WHERE expense_date BETWEEN '01-JAN-2004' AND '01-APR-2004' | ” | Single-line comment delimiter (two dashes followed by a space) | -- Finds all employees like Fudd, -- Fudge, and Fudston SELECT * FROM employee WHERE lname LIKE 'Fud%' | /* */ | Beginning multiline comment delimiter Ending multiline comment indicator | /* Finds all employees like Fudd, Fudge, and Fudston */ SELECT * FROM employee WHERE lname LIKE 'Fud%' | 2.2.4 Keywords and Reserved Words Just as certain symbols have special meaning and functionality within SQL, certain words and phrases have special significance. SQL keywords are words whose meanings are so closely tied to the operation of the RDBMS that they should not be used for any other purpose; generally, they are words used in a SQL statement. (Note that they can be used as an identifier on most platforms, but they shouldn't be.) For example, the word "SELECT" is a reserved word and should not be used as a table name. | It is generally a good idea to avoid naming columns or tables after a keyword that occurs in any major platform, because database applications are frequently converted from one platform to another. | | Reserved words , on the other hand, do not have special significance now, but they probably will in a future release. To emphasize the fact that keywords should not be used as an identifier, but nevertheless could be, the SQL standard calls them "nonreserved keywords." Reserved words and keywords are not always words used in SQL statements, but may be words commonly associated with database technology. For example, CASCADE is used to describe data manipulations that allow their action, such as a delete or update, to "flow down," or cascade, to any subordinant tables. Reserved words and keywords are widely published so that programmers will not use them as identifiers that will, at some later revision, cause a problem. SQL2003 specifies its own list of reserved words and keywords. In addition, the database platform specify their own list of reserved words and keywords because they each have their own extensions to the SQL command set. SQL standard keywords, as well as the keywords in the different vendor implementations , are shown in Appendix B. |