2.2 Categories of Syntax

     

To begin to use SQL, you must understand how statements are written. SQL syntax falls into four main categories. Each category is introduced in the following list and then explained in further detail in the sections that follow:


Identifiers

Describe a user - or system-supplied name for database objects, such as a database, a table, a constraint on a table, the columns in a table, a view, etc.


Literals

Describe a user- or system-supplied string or value that is not otherwise an identifier or a keyword. Literals may be strings like "hello", numbers like 1234, dates like "Jan 01, 2002", or Boolean values like TRUE.


Operators

Are symbols specifying an action performed on one or more expressions, most often in DELETE , INSERT , SELECT , or UPDATE statements. Operators are also used frequently in the creation of database objects.


Reserved words and keywords

Have special meaning to the database SQL parser, such as SELECT , GRANT , DELETE , or CREATE . Reserved words are words that cannot be used as identifiers within the database platform, usually commands or SQL statements. Keywords are words that may become reserved some time in the future. You can circumvent the restriction on using reserved words and keywords as identifiers through quoted identifiers , described in a moment. This is not recommended! Elsewhere in the book (besides here), we use the term keyword to describe both concepts.

2.2.1 Identifiers

Keep in mind that RDBMSs are built upon set theory. In ANSI terms, clusters contain sets of catalogs, catalogs contain sets of schemas, schemas contain sets of objects, and so on. Most database platforms use corollary terms: instances contain one or more databases; databases contain one or more schemas; schemas contain one or more tables, views, or stored procedures, and the privileges associated with each object. At each level of this structure, items require a unique name (that is, an identifier) so that they can be referenced by programs and system processes.This means that each object (whether a database, table, view, column, index, key, trigger, stored procedure, or constraint) in a RDBMS must be identified. When issuing the command that creates a database object, you must specify an identifier (i.e., a name) for that new object.

There are two important categories of rules that experienced programmers keep in mind when choosing an identifier for a given item:


Naming conventions

Include logical rules of thumb or naming conventions that ultimately create better database structures and data tracking. These are not so much required by SQL as they are the distilled experience of practiced programmers.


Identifier rules

Are those set by the SQL standard and implemented by the platforms. Such rules include characteristics like how long a name may be. These identifier conventions are covered for each vendor later in this chapter.

2.2.1.1 Naming conventions

Naming conventions establish a standard baseline for choosing object identifiers. In this section, we show a list of naming conventions (rules for picking your identifiers) that are based on long years of experience. The SQL standard has no comment on naming conventions outside of the uniqueness of an identifier, its length, and the characters that are valid within the identifier:


Select a name that is meaningful, relevant, and descriptive

Do not name a table XP03 ; instead, name it Expenses_2005 , showing that it stores expenses for the year 2005. Remember that other people will likely be using the table or database too, perhaps long after you have gone, and the names should make sense at a glance. Each database vendor has limits on object name size , but names generally may be long enough to make sense to anyone reading them.


Choose and apply the same case throughout

Use either all uppercase or all lowercase for all objects throughout the database. Some database servers are case-sensitive, so using mixed-case identifiers might cause problems later.


Use abbreviations consistently

Once an abbreviation has been chosen , it should be used consistently throughout the database. For example, if EMP is used as an abbreviation for EMPLOYEE, then EMP should be used throughout the database. Do not use EMP in some places and EMPLOYEE in others.


Use complete, descriptive, meaningful names with underscores for reading clarity

A column name UPPERCASEWITHUNDERSCORES is not as easy to read as UPPERCASE_WITH_UNDERSCORES .


Do not put company or product names in database object names

Companies get acquired and products change names. These elements are too transitory to be included in database object names.


Do not use overly obvious prefixes or suffixes

For example, don't use "DB_" as a prefix for a database, and don't prefix every view with "V_". Simple queries to the system table of the database can tell the DBA or database programmer what type of object an identifier represents.


Do not fill up all available space for the object name

If the database platform allows a 32-character table name, try to leave at least a few free characters at the end. Some database platforms sometimes append prefixes or suffixes to table names when manipulating temporary copies of the tables.


Do not use quoted identifiers

Quoted identifiers are object names stored within double-quotes. (The ANSI standard calls these delimited identifiers .) Encapsulating an identifier within double-quotes allows you to create names that will be difficult to use and that may cause you problems later. Quoted identifiers are also case-sensitive. For example, you could embed spaces, special characters, mixed case characters, or even escape sequences within a quoted identifier, but because some third-party tools (and even vendor-supplied tools) cannot handle special characters in names, you should not make extensive use of quoted identifiers. Some platforms allow other delimiting symbols than double-quotes. For example, SQL Server uses brackets [ ] to designate quoted identifiers.

There are several benefits to following these naming conventions. First, your SQL code becomes, in a sense, self-documenting because the chosen names are meaningful and understandable to other users. Second, your SQL code and database objects are easier to maintain ” especially for other users who come after you ”because your objects are consistently named. Finally, maintaining consistency increases database functionality. If the database ever has to be transferred or migrated to another RDMS, consistent and descriptive naming saves both time and energy. Giving a few minutes of thought to naming SQL objects in the beginning can prevent problems later on.

2.2.1.2 Identifier rules

Identifier rules are rules for identifying objects, enforced by the database platforms, within the database. These rules apply to normal identifiers, not quoted identifiers. Rules specified by the SQL2003 standard generally differ somewhat from those of a specific database vendor. Table 2-1 contrasts the SQL2003 rules with those of the five RDMS platforms covered in this book.

Table 2-1. Platform-specific rules for regular object identifiers (excludes quoted identifiers)

Characteristic

Platform

Specification

Identifier size

SQL2003

128 characters

 

DB2

128 characters, depending on the object

 

MySQL

64 characters

 

Oracle

30 bytes (number of characters depends on the character set); database names are limited to 8 bytes

 

PostgreSQL

31 characters (NAMEDATALEN property minus 1)

 

SQL Server

128 characters (temp tables are limited to 116 characters)

Identifier may contain

SQL2003

Any number, character, or underscore

 

DB2

Any number, uppercase character, digit, or the underscore character

 

MySQL

Any number, character, or symbol

 

Oracle

Any number, character, and the underscore (_), pound (#), and dollar ($) symbols

 

PostgreSQL

Any number, character, or the underscore (_) symbol

 

SQL Server

Any number, character, and the underscore (_), at sign (@), pound (#), and dollar ($) symbols

Identifier must begin with

SQL2003

A letter

 

DB2

A letter

 

MySQL

A letter or number (but cannot be composed entirely of numbers)

 

Oracle

A letter

 

PostgreSQL

A letter or underscore (_)

 

SQL Server

A letter, underscore (_), at sign (@), or pound (#)

Identifier cannot contain

SQL2003

Spaces or special characters

 

DB2

Spaces or special characters

 

MySQL

Period (.), slash (/), or ASCII(0) and ASCII(255). Quote (') and double-quote (") are only allowed in quoted identifiers.

 

Oracle

Spaces, double-quotes ("), or special characters

 

PostgreSQL

Double-quote (")

 

SQL Server

Spaces or special characters

Allows quoted identifiers

SQL2003

Yes

 

DB2

Yes

 

MySQL

Yes

 

Oracle

Yes

 

PostgreSQL

Yes

 

SQL Server

Yes

Quoted identifier symbol

SQL2003

Double-quote ( " )

 

DB2

Double-quote (")

 

MySQL

Quote ( ' ) or double-quote (" ) in ANSI compatibility mode

 

Oracle

Double-quote (")

 

PostgreSQL

Double-quote (")

 

SQL Server

Double-quote (") or brackets ( [ ] ); brackets are preferred

Identifier may be reserved

SQL2003

No, unless as a quoted identifier

 

DB2

Yes

 

MySQL

No, unless as a quoted identifier

 

Oracle

No, unless as a quoted identifier

 

PostgreSQL

No, unless as a quoted identifier

 

SQL Server

No, unless as a quoted identifier

Schema addressing

SQL2003

Catalog.schema.object

 

DB2

Schema.object

 

MySQL

Database.object

 

Oracle

Schema.object

 

PostgreSQL

Database.schema.object

 

SQL Server

Server.database.schema.object

Identifier must be unique

SQL2003

Yes

 

DB2

Yes

 

MySQL

Yes

 

Oracle

Yes

 

PostgreSQL

Yes

 

SQL Server

Yes

Other rules

SQL2003

None

 

DB2

None

 

MySQL

May not contain numbers only

 

Oracle

Database links are limited to 128 bytes and may not be quoted identifiers

 

PostgreSQL

None

 

SQL Server

None


Identifiers must be unique within their scope. Thus, in our earlier discussion of the hierarchy of database objects, database names must be unique on a given instance of a database server, while the names of tables, views, functions, triggers, and stored procedures must be unique within a given schema. On the other hand, you can have a table and a stored procedure with the same name since they are different object types. The names of columns, keys, and indexes must be unique on a single table or view, and so forth. Check with the database platform documentation for more information ” some platforms require unique identifiers where others may not. For example, DB2 requires all index identifiers to be unique throughout the database, while SQL Server requires that the index identifier be unique only for the table it depends on.

Remember, quoted identifiers (object names encapsulated within a special delimiter , usually double quotes) may be used to break some of the identifier rules specified earlier. Specifically, quoted identifiers may be used to bestow a reserved word as a name, or to allow normally unusable characters and symbols within a name. For example, you normally can't use the percent sign ( % ) in a table name. However, you can, if you must, use the percent sign in a table name so long as you always enclose that table name within double quotes. To name a table expense%%ratios , you would specify the name in quotes as " expense%%ratios ". Again, remember that in SQL2003, such names are sometimes known as delimited identifiers.

Once you have created an object name as a quoted identifier, we recommend that you always reference it using its special delimiter.


2.2.2 Literals

SQL evaluates literal values as any explicit numeric, character string, temporal value (like a date or time), or Boolean value that is not an identifier or a keyword. SQL databases allow a variety of literal values in a SQL program. Literal values are allowed for most of the numeric, character, Boolean, and date datatypes. For example, SQL Server numeric datatypes include (among others) INTEGER , REAL , and MONEY . Thus, numeric literals can look like:

 30 -117 +883.3338 -6.66 000 2E5 7E-3 

As the example illustrates, SQL Server allows signed or unsigned numerals, in scientific or normal notation. And since SQL Server has a money datatype, even a dollar sign can be included. SQL Server does not allow other symbols in numeric literals (besides 0 1 2 3 4 5 6 7 8 9 + - $ . E e), so do not include commas (or periods in Europe). Most databases interpret a comma in a numeric literal as a list item separator . Thus, the literal value 3,000 would be interpreted by the database as 3 and, separately, 000.

Boolean, character string, and date literals look like:

 TRUE 'Hello world!' 'OCT-28-1966 22:14:30:00' 

Character string literals should always be enclosed by single quotation marks (' '), the standard delimiter for all character string literals. Character string literals are not restricted just to the alphabet. In fact, any character in the character set can be represented as a string literal. All of the following are string literals:

 '1998' '70,000 + 14000' 'There once was a man from Nantucket,' 'Oct 28, 1966' 

All of these examples are, in fact, compatible with the CHARACTER datatype. Remember not to confuse the string literal `1998' with the numeric literal 1998. Once string literals are associated with CHARACTER datatypes, it is poor practice to use them in arithmetic operations without explicitly converting them to a numeric datatype. Some database products will perform automatic conversion of string literals containing numbers when compared against any DATE or NUMBER datatype values.

By doubling the delimiter, you can effectively represent a single quotation mark in a literal string, if necessary. That is, use two quotation marks each time a single quotation mark is part of the value. This example taken from SQL Server illustrates the idea:

 SELECT 'So he said ''Who''s Le Petomaine?''' 

This gives the result:

 -----------------  So he said 'Who's Le Petomaine?' 

2.2.3 Operators

An operator is a symbol specifying an action that is performed on one or more expressions. Operators are used most often in DELETE , INSERT , SELECT , or UPDATE statements, but also are used frequently in the creation of database objects, such as stored procedures, functions, triggers, and views.

Operators typically fall into these categories:


Arithmetic operators

Supported by all databases


Assignment operators

Supported by all databases


Bitwise operators

Supported by Microsoft SQL Server


Comparison operators

Supported by all databases


Logical operators

Supported by DB2, Oracle, SQL Server, and PostgreSQL


Unary operators

Supported by DB2, Oracle, and SQL Server

2.2.3.1 Arithmetic operators

Arithmetic operators perform mathematical operations on two expressions of any datatypes in the numeric datatype category. See Table 2-2 for a listing of the arithmetic operators.

Table 2-2. Arithmetic operators

Arithmetic operator

Meaning

+

Addition

-

Subtraction

*

Multiplication

/

Division

%

Modula (SQL Server only); returns the remainder of a division operation as an integer value


In DB2, Oracle, and SQL Server, the + and - operators also can be used to perform arithmetic operations on date values.


2.2.3.2 Assignment operators

Except in Oracle, which uses := , the assignment operator ( = ) assigns a value to a variable or the alias of a column heading. In SQL Server, the keyword AS may serve as an operator for assigning table- or column-heading aliases.

2.2.3.3 Bitwise operators

Microsoft SQL Server provides bitwise operators as a shortcut to perform bit manipulations between two-integer expressions (see Table 2-3). Valid datatypes that are accessible to bitwise operators include binary , bit , int , smallint , tinyint , and varbinary .

Table 2-3. Bitwise operators

Bitwise operator

Meaning

&

Bitwise AND (two operands)

Bitwise OR (two operands)

^

Bitwise exclusive OR (two operands)


2.2.3.4 Comparison operators

Comparison operators test whether two expressions are equal or unequal . The result of a comparison operation is a Boolean value: TRUE , FALSE , or UNKNOWN . Also, note that the ANSI standard behavior for a comparison operation where one or more of the expressions are NULL is NULL. For example, the expression 23 + NULL returns NULL, as does the expression Feb 23, 2002 + NULL . See Table 2-4 for a list of the comparison operators.

Table 2-4. Comparison operators

Comparison operator

Meaning

=

Equal to

>

Greater than

<

Less than

>=

Greater than or equal to

<=

Less than or equal to

<>

Not equal to

!=

Not equal to (not ANSI standard)

!<

Not less than (not ANSI standard)

!>

Not greater than (not ANSI standard)


Boolean comparison operators are used most frequently in a WHERE clause to filter the rows that qualify for the search conditions. The following Microsoft SQL Server example uses the greater than or equal to comparison operation:

 SELECT *    FROM Products    WHERE ProductID >= @MyProduct 

2.2.3.5 Logical operators

Logical operators are commonly used in a WHERE clause to test for the truth of some condition. Logical operators return a Boolean value of either TRUE or FALSE . Logical operators also are discussed in SELECT Statement. Not all database systems support all operators. See Table 2-5 for a list of logical operators.

Table 2-5. Logical operators

Logical operator

Meaning

ALL

TRUE if all of a set of comparisons are TRUE

AND

TRUE if both Boolean expressions are TRUE

ANY

TRUE if any one of a set of comparisons is TRUE

BETWEEN

TRUE if the operand is within a range

EXISTS

TRUE if a subquery contains any rows

IN

TRUE if the operand is equal to one of a list of expressions or one or more rows returned by a subquery

LIKE

TRUE if the operand matches a pattern

NOT

Reverses the value of any other Boolean operator

OR

TRUE if either Boolean expression is TRUE

SOME

TRUE if some of a set of comparisons is TRUE


2.2.3.6 Unary operators

Unary operators perform an operation on only one expression of any of the datatypes of the numeric datatype category. Unary operators may be used on integer datatypes, though positive and negative may be used on any numeric datatype (see Table 2-6).

Table 2-6. Unary operators

Unary operator

Meaning

+

Numeric value is positive

-

Numeric value is negative

~

A bitwise NOT, returns the complement of the number (not in Oracle or DB2)


2.2.3.7 Operator precedence

Sometimes operator expressions become rather complex. When an expression has multiple operators, operator precedence determines the sequence in which the operations are performed. The order of execution can significantly affect the resulting value.

Operators have the precedence levels listed below. An operator on a higher level is evaluated before an operator on a lower level. The following listing denotes operators from highest to lowest precedence:

  • ( ) (parenthetical expressions)

  • + , - , ~ (unary operators)

  • * , / , % (mathematical operators)

  • + , - (arithmetic operators)

  • = , > , < , >= , <= , <> , != , !> , !< (comparison operators)

  • ^ (Bitwise Exclusive OR), & (Bitwise AND), (Bitwise OR)

  • NOT

  • AND

  • ALL, ANY, BETWEEN, IN, LIKE, OR, SOME

  • = (variable assignment)

Operators are evaluated from left to right when they are of equal precedence. However, parentheses are used to override the default precedence of the operators in an expression. Expressions within a parentheses are evaluated first, while operations outside the parentheses are evaluated next .

For example, the following expressions in an Oracle query return very different results:

 SELECT 2 * 4 + 5 FROM dual -- Evaluates to 8 + 5 which yields an expression result of 13. SELECT 2 * (4 + 5) FROM dual -- Evaluates to 2 * 9 which yields an expression result of 18. 

In expressions with nested parentheses, the most deeply nested expression is evaluated first.

This next example contains nested parentheses, with the expression 5 - 3 in the most deeply nested set of parentheses. This expression yields a value of 2. Then the addition operator (+) adds this result to 4, which yields a value of 6. Finally, the 6 is multiplied by 2 to yield an expression result of 12:

 SELECT 2 * (4 + (5 - 3) ) FROM dual -- Evaluates to 2 * (4 + 2) which further evaluates to 2 * 6,  --and yields an expression result of 12. RETURN 

2.2.3.8 System delimiters and operators

String delimiters mark the boundaries of a string of alphanumeric characters. System delimiters are those symbols within the character set that have special significance to your database server. Delimiters are symbols that are used to judge the order or hierarchy of processes and list items. Operators are those delimiters used to judge values in comparison operations, including symbols commonly used for arithmetic or mathematical operations. Table 2-7 lists the system delimiters and operators allowed by SQL.

Table 2-7. SQL delimiters and operators

Symbol

Usage

Example

+

Addition operator; in SQL Server, also serves as a concatenation operator

On all database platforms:

 SELECT MAX(emp_id) + 1 FROM employee 

-

Subtraction operator; also a range indicator in CHECK constraints

As a subtraction operator:

 SELECT MIN(emp_id) - 1 FROM employee 

As a range operator, in a CHECK constraint:

 ALTER TABLE authors ADD CONSTRAINT authors_zip_num CHECK (zip LIKE '%[0-9]%') 

*

Multiplication operator

 SELECT salary * 0.05 AS 'bonus' FROM employee; 

/

Division operator

 SELECT salary / 12 AS 'monthly' FROM employee; 

=

Equality operator

 SELECT *  FROM employee WHERE lname = 'Fudd' 

<>

Inequality operators (!= is a nonstandard equivalent on several platforms)

On all platforms:

 SELECT * FROM employee WHERE lname <> 'Fudd' 

<

<=

Less-than operator

Less-than or equal-to operator

 SELECT lname, emp_id,     (salary * 0.05) AS bonus FROM employee WHERE (salary * 0.05) <= 10000   AND exempt_status < 3 

>

>=

Greater-than operator

Greater-than or equal-to operator

 SELECT lname, emp_id,    (salary * 0.025) AS bonus FROM employee WHERE (salary * 0.025) > 10000   AND exempt_status >= 4 

(

)

Expression, function calls, order of operations, and subquery delimiter

Expression:

 SELECT (salary / 12) AS monthly FROM employee WHERE exempt_status >= 4 

Function call:

 SELECT SUM(travel_expenses) FROM "expense%%ratios" 

   

Order of operations:

 SELECT (salary / 12) AS monthly,    ((salary / 12) / 2) AS biweekly FROM employee WHERE exempt_status >= 4 

   

Subquery:

 SELECT * FROM stores WHERE stor_id IN    (SELECT stor_id FROM sales 

WHERE ord_date > '01-JAN-2004')

%

Wildcard attribute indicator

(refer toLIKE Operator.)

 SELECT *  FROM employee WHERE lname LIKE 'Fud%' 

,

List item separator

 SELECT lname,    fname,    ssn,    hire_date FROM employee WHERE lname = 'Fudd' 

.

Identifier qualifier separator

 SELECT *  FROM  scott.employee WHERE lname LIKE 'Fud%' 

'

Character string indicators

 SELECT *  FROM  employee WHERE lname LIKE 'FUD%'    OR fname = 'ELMER' 

"

Quoted identifier indicators

 SELECT expense_date,      SUM(travel_expense)  FROM  "expense%%ratios" WHERE expense_date     BETWEEN '01-JAN-2004'         AND '01-APR-2004' 

Single-line comment delimiter (two dashes followed by a space)

 -- Finds all employees like Fudd, -- Fudge, and Fudston SELECT *  FROM employee WHERE lname LIKE 'Fud%' 

/*

*/

Beginning multiline comment delimiter

Ending multiline comment indicator

 /* Finds all employees like Fudd,    Fudge, and Fudston  */ SELECT *  FROM employee WHERE lname LIKE 'Fud%' 


2.2.4 Keywords and Reserved Words

Just as certain symbols have special meaning and functionality within SQL, certain words and phrases have special significance. SQL keywords are words whose meanings are so closely tied to the operation of the RDBMS that they should not be used for any other purpose; generally, they are words used in a SQL statement. (Note that they can be used as an identifier on most platforms, but they shouldn't be.) For example, the word "SELECT" is a reserved word and should not be used as a table name.

It is generally a good idea to avoid naming columns or tables after a keyword that occurs in any major platform, because database applications are frequently converted from one platform to another.


Reserved words , on the other hand, do not have special significance now, but they probably will in a future release. To emphasize the fact that keywords should not be used as an identifier, but nevertheless could be, the SQL standard calls them "nonreserved keywords." Reserved words and keywords are not always words used in SQL statements, but may be words commonly associated with database technology. For example, CASCADE is used to describe data manipulations that allow their action, such as a delete or update, to "flow down," or cascade, to any subordinant tables. Reserved words and keywords are widely published so that programmers will not use them as identifiers that will, at some later revision, cause a problem.

SQL2003 specifies its own list of reserved words and keywords. In addition, the database platform specify their own list of reserved words and keywords because they each have their own extensions to the SQL command set. SQL standard keywords, as well as the keywords in the different vendor implementations , are shown in Appendix B.



SQL in a Nutshell
SQL in a Nutshell (In a Nutshell (OReilly))
ISBN: 0596518846
EAN: 2147483647
Year: 2003
Pages: 78

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net